Abstract

In lossy source coding with side information at the decoder (i.e., the Wyner-Ziv problem), the 
estimate of the source obtained at the decoder cannot be generally reproduced at the encoder, due to 
its dependence on the side information. In some applications this may be undesirable, and a Common 
Reconstruction (CR) requirement, whereby one imposes that the encoder and decoder be able to agree on 
the decoder's estimate, may be instead in order. The rate-distortion function under the CR constraint has 
been recently derived for a point-to-point (Wyner-Ziv) problem. In this paper, this result is extended to 
three multiterminal settings with three nodes, namely the Heegard-Berger (HB) problem, its variant with 
cooperating decoders and the cascade source coding problem. The HB problem consists of an encoder 
broadcasting to two decoders with respective side information. The cascade source coding problem is 
characterized by a two-hop system with side information available at the intermediate and final nodes. 

For the HB problem with the CR constraint, the rate-distortion function is derived under the 
assumption that the side information sequences are (stochastically) degraded. The rate-distortion function 
is also calculated explicitly for three examples, namely Gaussian source and side information with 
quadratic distortion metric, and binary source and side information with erasure and Hamming distortion 
metrics. The rate-distortion function is then characterized for the HB problem with cooperating decoders
and (physically) degraded side information. For the cascade problem with the CR constraint, the rate- 
distortion region is obtained under the assumption that side information at the final node is physically 
degraded with respect to that at the intermediate node. For the latter two cases, it is worth emphasizing 
that the corresponding problem without the CR constraint is still open. Outer and inner bounds on 
the rate-distortion region are also obtained for the cascade problem under the assumption that the side 
information at the intermediate node is physically degraded with respect to that at the final node. For the 
three examples mentioned above, the bounds are shown to coincide. Finally, for the HB problem, the 
rate-distortion function is obtained under the more general requirement of constrained reconstruction, 
whereby the decoder's estimate must be recovered at the encoder only within some distortion. 

Index Terms 

Common reconstruction, source coding with side information, Heegard-Berger problem, cascade 
source coding. 



I. Introduction 

Source coding problems with side information at the decoder(s) model a large number of
scenarios of practical interest, including video streaming [1] and wireless sensor networks [2].
From an information-theoretic perspective, the baseline setting for this class of problems is one
in which a memoryless source $X^n = (X_1, \ldots, X_n)$ is to be communicated by an encoder at a rate of
$R$ bits per source symbol to a decoder that has available a correlated sequence $Y^n$ that is related
to $X^n$ via a memoryless channel $p(y|x)$ (see Fig. 1¹). Under the requirement of asymptotically
lossless reconstruction $\hat{X}^n$ of the source $X^n$ at the decoder, the minimum required rate was
obtained by Slepian and Wolf in [3]. Later, the more general optimal trade-off between the rate $R$
and the distortion $D$ between the source $X^n$ and the reconstruction $\hat{X}^n$ was obtained by Wyner and
Ziv in [4] for any given distortion metric $d(x, \hat{x})$. It was shown to be given by the rate-distortion
function

$$R_{X|Y}^{WZ}(D) = \min I(X;U|Y), \qquad (1)$$

where the minimum is taken over all probability mass functions (pmfs) $p(u|x)$ and deterministic
functions $\hat{x}(u,y)$ such that $E[d(X, \hat{x}(U,Y))] \le D$.

¹The presence of the function $\psi$ at the encoder will be explained later.

Figure 1. Point-to-point source coding with common reconstruction [5].



A. Heegard-Berger and Cascade Source Coding Problems 

In applications such as the ones discussed above, the point-to-point setting of Fig. 1 does not
fully capture the main features of the source coding problem. For instance, in video streaming,
a transmitter typically broadcasts information to a number of decoders. As another example, in
sensor networks, data is typically routed over multiple hops towards the destination. A model
that accounts for the aspect of broadcasting to multiple decoders is the Heegard-Berger (HB)
set-up shown in Fig. 2. In this model, the link of rate $R$ bits per source symbol is used to
communicate to two receivers having different side information sequences, $Y_1^n$ and $Y_2^n$, which
are related to the source $X^n$ via a memoryless channel $p(y_1, y_2|x)$. The set of all achievable triples
$(R, D_1, D_2)$ for this model, where $D_1$ and $D_2$ are the distortion levels at Decoders 1 and 2,
respectively, was derived in [6] and [7] under the assumption that the side information sequences
are (stochastically) degraded versions of the source $X^n$. In a variation of this model shown in
Fig. 3, decoder cooperation is enabled by a limited-capacity link from one decoder (Decoder 1)
to the other (Decoder 2). Inner and outer bounds on the rate-distortion region for this problem
are obtained in [8] under the assumption that the side information of Decoder 2 is (physically)
degraded with respect to that of Decoder 1.

As for multihopping, a basic model that captures some of the key design issues is shown in
Fig. 4. In this cascade set-up, an encoder (Node 1) communicates with rate $R_1$ to an intermediate
node (Node 2), which has side information $Y_1^n$, and in turn communicates with rate $R_2$ to a
final node (Node 3) with side information $Y_2^n$. Both Node 2 and Node 3 act as decoders, similar
to the HB problem of Fig. 2, in the sense that they reconstruct a local estimate of the source $X^n$.
The rate-distortion function for this problem has been derived for various special cases in [9],
[10], [11] and [12] (see Table I in [12] for an overview). Reference [11] derives the set of all
achievable quadruples $(R_1, R_2, D_1, D_2)$, i.e., the rate-distortion region, for the case in which $Y_1^n$
is also available at the encoder and $Y_2^n$ is a physically degraded version of $X^n$ with respect to
$Y_1^n$. Instead, [10] derives the rate-distortion region under the assumptions that the source and the
side information sequences are jointly Gaussian, that the distortion metric is quadratic, and that
the sequence $Y_1^n$ is a physically degraded version of $X^n$ with respect to $Y_2^n$. The corresponding
result for binary source and side information and Hamming distortion metric was derived in
[12].

Figure 2. Heegard-Berger source coding problem with common reconstruction.

Figure 3. Heegard-Berger source coding problem with common reconstruction and decoder cooperation.

Figure 4. Cascade source coding problem with common reconstruction.



B. Common Reconstruction Constraint 

A key aspect of the optimal strategies identified in [4], [6], [9], [10] and [11] is that the
side information sequences are, in general, used in two different ways: (i) as a means to reduce
the rate required for communication between encoder and decoders via binning; and (ii) as
an additional observation that the decoder can leverage, along with the bits received from the
encoder, in order to improve its local estimate. For instance, for the point-to-point system of
Fig. 1, the Wyner-Ziv result (1) reflects point (i) of the discussion above in the conditioning on
the side information $Y$, which reduces the rate, and point (ii) in the fact that the reconstruction $\hat{X}$
is a function $\hat{x}(U, Y)$ of the signal $U$ received from the encoder and the side information $Y$.

Leveraging the side information as per point (ii), while advantageous in terms of rate-distortion
trade-off, may have unacceptable consequences for some applications. In fact, this use of side
information entails that the reconstruction $\hat{X}$ of the decoder cannot be reproduced at the encoder.
In other words, encoder and decoder cannot agree on the specific reconstruction $\hat{X}$ obtained at
the receiver side, but only on the average distortion level $D$. In applications such as transmission
of sensitive medical, military or financial data, this may not be desirable. Instead, one may want
to add the constraint that the reconstruction at the decoder be reproducible by the encoder [5].
This idea, referred to as the Common Reconstruction (CR) constraint, was first proposed in [5],
where it is shown for the point-to-point setting of Fig. 1² that the rate-distortion function under
the CR constraint is given by

$$R_{X|Y}^{CR}(D) = \min I(X;\hat{X}|Y), \qquad (2)$$

where the minimum is taken over all pmfs $p(\hat{x}|x)$ such that $E[d(X, \hat{X})] \le D$. Comparing (2)
with the Wyner-Ziv rate-distortion function (1), it can be seen that the additional CR constraint prevents
the decoder from using the side information as a means to improve its estimate $\hat{X}$ (see point
(ii) above).

²The function $\psi$ at the encoder calculates the estimate of the encoder regarding the decoder's reconstruction.

The original work of [5] has been recently extended in [13], where a relaxed CR constraint
is considered in which only a distortion constraint is imposed between the decoder's reconstruction
and its reproduction at the encoder. We refer to this setting as imposing a Constrained
Reconstruction (ConR) requirement.

C. Main Contributions 

In this paper, we study the HB source coding problem (Fig. 2) and the cascade source coding
problem (Fig. 4) under the CR requirement. The considered models are thus relevant for the
transmission of sensitive information, which is constrained by CR, via broadcast or multi-hop
links, a common occurrence in, e.g., medical, military or financial applications (e.g., for intranets
of hospitals or financial institutions). Specifically, our main contributions are:

• For the HB problem with the CR constraint (Fig. 2), we derive the rate-distortion function
under the assumption that the side information sequences are (stochastically) degraded. We
also calculate this function explicitly for three examples, namely Gaussian source and side
information with quadratic distortion metric, and binary source and erasure side information
with erasure and Hamming distortion metrics (Sec. II);

• For the HB problem with the CR constraint and decoder cooperation (Fig. 3), we derive
the rate-distortion region under the assumption that the side information sequences are
(physically) degraded in either direction (Sec. III-A and Sec. III-B). We emphasize that the
corresponding problem without the CR constraint is still open as per the discussion above;

• For the cascade problem with the CR constraint (Fig. 4), we obtain the rate-distortion region
under the assumption that side information $Y_2$ is physically degraded with respect to $Y_1$ (Sec.
IV-B). We emphasize that the corresponding problem without the CR constraint is still open
as per the discussion above;

• For the cascade problem with the CR constraint (Fig. 4), we obtain outer and inner bounds on
the rate-distortion region under the assumption that the side information $Y_1$ is physically
degraded with respect to $Y_2$. Moreover, for the three examples mentioned above in the context
of the HB problem, we show that the bounds coincide and we evaluate the corresponding
rate-distortion region explicitly (Sec. IV-C);

• For the HB problem, we finally derive the rate-distortion function under the more general
requirement of ConR (Sec. V).

Notation: For integers $a$ and $b$ with $a \le b$, we define $[a, b]$ as the interval $[a, a+1, \ldots, b]$, and we
use $x_a^b$ to denote the sequence $(x_a, \ldots, x_b)$. We will also write $x^b$ for $x_1^b$ for simplicity. Upper
case, lower case and calligraphic letters denote random variables, specific values of random
variables and their alphabets, respectively. Given discrete random variables, or more generally
vectors, $X$ and $Y$, we will use the notation $p_X(x)$ or $p(x)$ for $\Pr[X = x]$, and $p_{X|Y}(x|y)$
or $p(x|y)$ for $\Pr[X = x | Y = y]$, where the latter notations are used when the meaning is
clear from the context. Given a set $\mathcal{X}$, we denote by $\mathcal{X}^n$ the $n$-fold Cartesian product of $\mathcal{X}$.
For random variables $X$ and $Y$, we denote by $\sigma_{X|Y}^2$ the (average) conditional variance of $X$
given $Y$, i.e., $E[E[(X - E[X|Y])^2 | Y]]$. We adopt the notation convention in [14], in which $\delta(\epsilon)$
represents any function such that $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. We define the binary entropy function
$H(p) = -p \log_2 p - (1-p) \log_2(1-p)$. Finally, we define $\alpha * \beta = \alpha(1-\beta) + \beta(1-\alpha)$.
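The two scalar functions defined at the end of the notation paragraph can be written down directly. The following is a minimal Python sketch; the function names `binary_entropy` and `binary_convolution` are illustrative choices and do not appear in the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) = -p log2 p - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def binary_convolution(alpha: float, beta: float) -> float:
    """Binary convolution alpha * beta = alpha(1 - beta) + beta(1 - alpha)."""
    return alpha * (1 - beta) + beta * (1 - alpha)
```

The convolution gives the crossover probability of two cascaded binary symmetric channels, which is how it is typically used with Hamming distortion.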

II. Heegard-Berger Problem with Common Reconstruction 

In this section, we first detail the system model for the HB source coding problem in Fig. 2
with CR in Sec. II-A. Next, the characterization of the corresponding rate-distortion performance
is derived under the assumption that one of the two side information sequences is a stochastically
degraded version of the other in the sense of [6] (see (10)). Finally, three specific examples are
worked out, namely Gaussian sources under quadratic distortion (Sec. II-C), and binary sources
with side information sequences subject to erasures under Hamming or erasure distortion (Sec.
II-D).
A. System Model 

In this section the system model for the HB problem with CR is detailed. The system is defined
by the pmf $p_{XY_1Y_2}(x, y_1, y_2)$ and discrete alphabets $\mathcal{X}$, $\mathcal{Y}_1$, $\mathcal{Y}_2$, $\hat{\mathcal{X}}_1$, and $\hat{\mathcal{X}}_2$ as follows. The source
sequence $X^n$ and side information sequences $Y_1^n$ and $Y_2^n$, with $X^n \in \mathcal{X}^n$, $Y_1^n \in \mathcal{Y}_1^n$, and $Y_2^n \in \mathcal{Y}_2^n$,
are such that the tuples $(X_i, Y_{1i}, Y_{2i})$ for $i \in [1,n]$ are independent and identically distributed
(i.i.d.) with joint pmf $p_{XY_1Y_2}(x, y_1, y_2)$. The encoder measures a sequence $X^n$ and encodes it into
a message $J$ of $nR$ bits, which is delivered to the decoders. Decoders 1 and 2 wish to reconstruct
the source sequence $X^n$ within given distortion requirements, to be discussed below, as $\hat{X}_1^n \in \hat{\mathcal{X}}_1^n$
and $\hat{X}_2^n \in \hat{\mathcal{X}}_2^n$, respectively. The estimated sequence $\hat{X}_j^n$ is obtained as a function of the message
$J$ and the side information sequence $Y_j^n$ for $j = 1,2$. The estimates are constrained to satisfy
distortion constraints defined by per-symbol distortion metrics $d_j(x, \hat{x}_j): \mathcal{X} \times \hat{\mathcal{X}}_j \to [0, D_{max}]$
with $0 < D_{max} < \infty$. Based on the given distortion metrics, the overall distortion for the
estimated sequences $\hat{x}_1^n$ and $\hat{x}_2^n$ is defined as

$$d_j^n(x^n, \hat{x}_j^n) = \frac{1}{n} \sum_{i=1}^{n} d_j(x_i, \hat{x}_{ji}) \quad \text{for } j = 1,2. \qquad (3)$$

The reconstructions $\hat{X}_1^n$ and $\hat{X}_2^n$ are also required to satisfy the CR constraints, as formalized
below.

Definition 1. An $(n, R, D_1, D_2, \epsilon)$ code for the HB problem with CR consists of an encoding
function

$$g: \mathcal{X}^n \to [1, 2^{nR}], \qquad (4)$$

which maps the source sequence $X^n$ into a message $J$; a decoding function for Decoder 1,

$$h_1: [1, 2^{nR}] \times \mathcal{Y}_1^n \to \hat{\mathcal{X}}_1^n, \qquad (5)$$

which maps the message $J$ and the side information $Y_1^n$ into the estimated sequence $\hat{X}_1^n$; a
decoding function for Decoder 2,

$$h_2: [1, 2^{nR}] \times \mathcal{Y}_2^n \to \hat{\mathcal{X}}_2^n, \qquad (6)$$

which maps the message $J$ and the side information $Y_2^n$ into the estimated sequence $\hat{X}_2^n$; and two
reconstruction functions

$$\psi_1: \mathcal{X}^n \to \hat{\mathcal{X}}_1^n \qquad (7a)$$
$$\text{and } \psi_2: \mathcal{X}^n \to \hat{\mathcal{X}}_2^n, \qquad (7b)$$

which map the source sequence into the estimated sequences at the encoder, namely $\psi_1(X^n)$
and $\psi_2(X^n)$, respectively; such that the distortion constraints are satisfied, i.e.,

$$\frac{1}{n} \sum_{i=1}^{n} E[d_j(X_i, \hat{X}_{ji})] \le D_j \quad \text{for } j = 1,2, \qquad (8)$$

and the CR requirements hold, namely,

$$\Pr[\psi_j(X^n) \ne h_j(g(X^n), Y_j^n)] \le \epsilon, \quad j = 1,2. \qquad (9)$$

Given a distortion pair $(D_1, D_2)$, a rate $R$ is said to be achievable if, for any $\epsilon > 0$ and
sufficiently large $n$, there exists an $(n, R, D_1 + \epsilon, D_2 + \epsilon, \epsilon)$ code. The rate-distortion function
$R(D_1, D_2)$ is defined as $R(D_1, D_2) = \inf\{R:$ the triple $(R, D_1, D_2)$ is achievable$\}$.

B. Rate-Distortion Function

In this section, a single-letter characterization of the rate-distortion function for the HB problem
with CR is derived, under the assumption that the joint pmf $p(x, y_1, y_2)$ is such that there exists
a conditional pmf $\tilde{p}(y_1|y_2)$ for which

$$p(x, y_1) = \sum_{y_2 \in \mathcal{Y}_2} p(x, y_2)\, \tilde{p}(y_1|y_2). \qquad (10)$$

In other words, the side information $Y_1$ is a stochastically degraded version of $Y_2$.

Proposition 1. If the side information $Y_1$ is stochastically degraded with respect to $Y_2$, the
rate-distortion function for the HB problem with CR is given by

$$R_{HB}^{CR}(D_1, D_2) = \min I(X;\hat{X}_1|Y_1) + I(X;\hat{X}_2|Y_2 \hat{X}_1), \qquad (11)$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, \hat{x}_1, \hat{x}_2) = p(x, y_1, y_2)\, p(\hat{x}_1, \hat{x}_2|x), \qquad (12)$$

and minimization is performed with respect to the conditional pmf $p(\hat{x}_1, \hat{x}_2|x)$ under the
constraints

$$E[d_j(X, \hat{X}_j)] \le D_j, \quad \text{for } j = 1,2. \qquad (13)$$

The proof of the converse can be found in Appendix A. Achievability follows as a special
case of Theorem 3 of [6] and can be easily shown using standard arguments. In particular,
the encoder randomly generates a standard lossy source code $\hat{X}_1^n$ for the source $X^n$ with rate
$I(X;\hat{X}_1)$ bits per source symbol. Random binning is used to reduce the rate to $I(X;\hat{X}_1|Y_1)$.
By the Wyner-Ziv theorem [14, p. 280], this guarantees that both Decoder 1 and Decoder 2 are
able to recover $\hat{X}_1^n$ (since $Y_1$ is a degraded version of $Y_2$). The encoder then maps the source $X^n$
into the reconstruction sequence $\hat{X}_2^n$ using a codebook that is generated conditional on $\hat{X}_1^n$ with
rate $I(X;\hat{X}_2|\hat{X}_1)$ bits per source symbol. Random binning is again used to reduce the rate to
$I(X;\hat{X}_2|Y_2 \hat{X}_1)$. From the Wyner-Ziv theorem, and the fact that Decoder 2 knows the sequence
$\hat{X}_1^n$, it follows that Decoder 2 can recover the reconstruction $\hat{X}_2^n$ as well. Note that, since the
reconstruction sequences $\hat{X}_1^n$ and $\hat{X}_2^n$ are generated by the encoder, functions $\psi_1$ and $\psi_2$ that
guarantee the CR constraints (9) exist by construction.
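To make the two rate terms in (11) concrete, the following Python sketch evaluates $I(X;\hat{X}_1|Y_1) + I(X;\hat{X}_2|Y_2\hat{X}_1)$ numerically from a joint pmf of the form (12). It is only an illustration under stated assumptions: the instance is a uniform binary source with erased side information (erasure probabilities `p1 >= p2`, with `p2 < 1`), and the test channel $\hat{X}_2 = X \oplus Q_2$, $\hat{X}_1 = \hat{X}_2 \oplus Q'$ is one particular choice, not the result of the minimization. All function names are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def entropy(pmf, coords):
    """Entropy (bits) of the marginal of `pmf` on the coordinate indices `coords`."""
    marginal = defaultdict(float)
    for outcome, prob in pmf.items():
        marginal[tuple(outcome[i] for i in coords)] += prob
    return -sum(q * math.log2(q) for q in marginal.values() if q > 0)


def cond_mutual_info(pmf, a, b, c):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C) for index tuples a, b, c."""
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))


def joint_pmf(p1, p2, D1, D2):
    """Joint pmf of (X, Y1, Y2, Xh1, Xh2) for the assumed binary erasure instance.

    Assumes 0 <= p2 <= p1 <= 1, p2 < 1 and 0 <= D2 <= D1 < 1/2.
    """
    pt = (p1 - p2) / (1 - p2)        # extra erasure probability from Y2 to Y1
    q = (D1 - D2) / (1 - 2 * D2)     # chosen so that Pr[X != Xh1] = D1
    pmf = defaultdict(float)
    for x, e2, e1, n2, n1 in itertools.product((0, 1), repeat=5):
        prob = 0.5
        prob *= p2 if e2 else 1 - p2
        prob *= pt if e1 else 1 - pt
        prob *= D2 if n2 else 1 - D2
        prob *= q if n1 else 1 - q
        y2 = 'e' if e2 else x
        y1 = 'e' if (e1 or e2) else x    # Y1 is a further-erased version of Y2
        xh2 = x ^ n2                     # Xh2 = X xor Ber(D2)
        xh1 = xh2 ^ n1                   # Xh1 = Xh2 xor Ber(q)
        pmf[(x, y1, y2, xh1, xh2)] += prob
    return pmf


def cr_objective(p1, p2, D1, D2):
    """Objective of (11) evaluated at the assumed test channel."""
    pmf = joint_pmf(p1, p2, D1, D2)
    X, Y1, Y2, XH1, XH2 = range(5)
    return (cond_mutual_info(pmf, (X,), (XH1,), (Y1,))
            + cond_mutual_info(pmf, (X,), (XH2,), (Y2, XH1)))
```

For this test channel the numerical value agrees with $p_1(1 - H(D_1)) + p_2(H(D_1) - H(D_2))$, since the first term is nonzero only when $Y_1$ is erased and the second only when $Y_2$ is erased.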

Remark 1. Under the physical degradedness assumption that the Markov chain condition $X - Y_2 - Y_1$
holds, equation (11) can be rewritten as

$$R_{HB}^{CR}(D_1, D_2) = \min I(X;\hat{X}_1 \hat{X}_2|Y_2) + I(\hat{X}_1;Y_2|Y_1), \qquad (14)$$

with the minimization defined as in (11). This expression quantifies by $I(\hat{X}_1;Y_2|Y_1)$ the additional
rate that is required with respect to the ideal case in which both decoders have the better side
information $Y_2$.
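The step from (11) to (14) can be spelled out as follows. This is a sketch of the derivation, assuming the joint pmf (12) together with physical degradedness, so that $(\hat{X}_1, \hat{X}_2) - X - Y_2 - Y_1$ forms a Markov chain.

```latex
\begin{align*}
I(X;\hat{X}_1|Y_1)
  &= I(X,Y_2;\hat{X}_1|Y_1)
     && \text{since } \hat{X}_1 - X - (Y_1,Y_2) \\
  &= I(\hat{X}_1;Y_2|Y_1) + I(X;\hat{X}_1|Y_1,Y_2)
     && \text{chain rule} \\
  &= I(\hat{X}_1;Y_2|Y_1) + I(X;\hat{X}_1|Y_2)
     && \text{since } Y_1 - Y_2 - (X,\hat{X}_1), \\
\intertext{so that}
I(X;\hat{X}_1|Y_1) + I(X;\hat{X}_2|Y_2,\hat{X}_1)
  &= I(\hat{X}_1;Y_2|Y_1) + I(X;\hat{X}_1|Y_2) + I(X;\hat{X}_2|Y_2,\hat{X}_1) \\
  &= I(\hat{X}_1;Y_2|Y_1) + I(X;\hat{X}_1\hat{X}_2|Y_2).
\end{align*}
```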

Remark 2. If we remove the CR constraint, then the rate-distortion function under the assumption
of Proposition 1 is given by [6]

$$R_{HB}(D_1, D_2) = \min I(X;U_1|Y_1) + I(X;U_2|Y_2 U_1), \qquad (15)$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, u_1, u_2, \hat{x}_1, \hat{x}_2) = p(x, y_1, y_2)\, p(u_1, u_2|x)\, \delta(\hat{x}_1 - \hat{x}_1(u_1, y_1))\, \delta(\hat{x}_2 - \hat{x}_2(u_2, y_2)), \qquad (16)$$

and minimization is performed with respect to the conditional pmf $p(u_1, u_2|x)$ and the
deterministic functions $\hat{x}_j(u_j, y_j)$, for $j = 1,2$, such that the distortion constraints (13) are satisfied.
Comparison of (11) with (15) reveals that, similar to the discussion around (1) and (2), the CR
constraint permits the use of side information only to reduce the rate via binning, but not to
improve the decoder's estimates via the use of the auxiliary codebooks represented by variables
$U_1$ and $U_2$, and functions $\hat{x}_j(u_j, y_j)$, for $j = 1,2$, in (16).
Remark 3. Consider the case in which the side information sequences are available in a causal
fashion in the sense of [16], that is, the decoding functions (5)-(6) are modified as $h_{ji}: [1, 2^{nR}] \times
\mathcal{Y}_j^i \to \hat{\mathcal{X}}_j$, for $i \in [1,n]$ and $j = 1,2$, respectively. Following similar steps as in the proof of
Proposition 2 and in [16], it can be concluded that, under the CR constraint, the rate-distortion
function in this case is the same as if the two side information sequences were not available at
the decoders, and is thus given by (11) upon removing the conditioning on the side information.
Note that this is true irrespective of the joint pmf $p(x, y_1, y_2)$ and hence it holds also for
non-degraded side information. This result can be explained by noting that, as explained in [16],
causal side information prevents the possibility of reducing the rate via binning. Since the CR
constraint also prevents the side information from being used to improve the decoders' estimates,
it follows that the side information is useless in terms of rate-distortion performance, if used
causally under the CR constraint.

On a similar note, if only side information $Y_1$ is causally available, while $Y_2$ can still be used
in the conventional non-causal fashion, then it can be proved that $Y_1$ can be neglected without
loss of optimality. Therefore, the rate-distortion function follows from (11) by removing the
conditioning on $Y_1$.

Remark 4. In [19], a related model is studied in which the source is given as $X = (Y_1, Y_2)$
and each decoder is interested in reconstructing a lossy version of the side information available
at the other decoder. The CR constraint is imposed in a different way by requiring that each
decoder be able to reproduce the estimate reconstructed at the other decoder.

C. Gaussian Sources and Quadratic Distortion 

In this section, we highlight the result of Proposition 1 by considering a zero-mean Gaussian
source $X \sim \mathcal{N}(0, \sigma_x^2)$, with side information variables

$$Y_1 = X + Z_1 \qquad (17a)$$
$$\text{and } Y_2 = X + Z_2, \qquad (17b)$$

where $Z_1 \sim \mathcal{N}(0, N_1 + N_2)$ and $Z_2 \sim \mathcal{N}(0, N_2)$ are independent of each other and of
$X$. Note that the joint distribution of $(X, Y_1, Y_2)$ satisfies the stochastic degradedness condition.
We focus on the quadratic distortion $d_j(x, \hat{x}_j) = (x - \hat{x}_j)^2$ for $j = 1,2$. By leveraging standard
arguments that allow us to apply Proposition 1 to Gaussian sources under a mean-square-error
constraint (see [14, pp. 50-51] and [15]), we obtain a characterization of the rate-distortion
function for the given distortion metrics.

Figure 5. Illustration of the distortion regions in the rate-distortion function (19) for Gaussian sources and quadratic distortion.

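We first recall that for the point-to-point set-up in Fig. 1 with $X \sim \mathcal{N}(0, \sigma_x^2)$ and side
information $Y = X + Z$, with $Z \sim \mathcal{N}(0, N)$ independent of $X$, the rate-distortion function
with CR under quadratic distortion is given by [5]

$$R_G^{CR}(D, N) = \begin{cases} \frac{1}{2} \log_2\left( \frac{\sigma_x^2}{\sigma_x^2 + N} \cdot \frac{D + N}{D} \right) & \text{for } D \le \sigma_x^2 \\ 0 & \text{for } D > \sigma_x^2, \end{cases} \qquad (18)$$

where we have made explicit the dependence on $N$ of the function $R_G^{CR}(D, N)$ for convenience. The
rate-distortion function (18) for $D \le \sigma_x^2$ is obtained from (2) by choosing the distribution $p(\hat{x}|x)$
such that $X = \hat{X} + Q$, where $Q \sim \mathcal{N}(0, D)$ is independent of $\hat{X}$.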
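As a numerical companion to (18), the sketch below evaluates the CR rate and, for comparison, the Gaussian Wyner-Ziv rate. The Wyner-Ziv expression $\frac{1}{2}\log_2(\sigma_{X|Y}^2 / D)$ with $\sigma_{X|Y}^2 = \sigma_x^2 N / (\sigma_x^2 + N)$ is the standard result for this set-up and is brought in here as an assumption, since it is not stated in the text above. Function names are illustrative.

```python
import math


def rate_cr_gaussian(D, N, var_x):
    """Point-to-point rate-distortion function with CR, as in (18)."""
    if D >= var_x:
        return 0.0
    return 0.5 * math.log2(var_x * (D + N) / ((var_x + N) * D))


def rate_wz_gaussian(D, N, var_x):
    """Gaussian Wyner-Ziv rate (standard result, assumed here for comparison)."""
    cond_var = var_x * N / (var_x + N)   # conditional variance of X given Y
    if D >= cond_var:
        return 0.0
    return 0.5 * math.log2(cond_var / D)


if __name__ == "__main__":
    var_x, N = 4.0, 3.0
    for D in (0.5, 1.0, 1.5, 3.0):
        print(D, rate_cr_gaussian(D, N, var_x), rate_wz_gaussian(D, N, var_x))
```

For $D$ below the conditional variance, the gap between the two rates is $\frac{1}{2}\log_2((D + N)/N)$, which is one way to read the rate penalty of not letting the decoder refine its estimate with $Y$.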

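Proposition 2. The rate-distortion function for the HB problem with CR for Gaussian sources
(17) and quadratic distortion is given by

$$R_{HB}^{CR}(D_1, D_2) = \begin{cases} 0 & \text{if } D_1 \ge \sigma_x^2 \text{ and } D_2 \ge \sigma_x^2 \\ R_G^{CR}(D_1, N_1 + N_2) & \text{if } D_1 \le \sigma_x^2 \text{ and } D_2 \ge \min(D_1, \sigma_x^2) \\ R_G^{CR}(D_2, N_2) & \text{if } D_1 \ge \sigma_x^2 \text{ and } D_2 \le \sigma_x^2 \\ \tilde{R}_{HB}^{CR}(D_1, D_2) & \text{if } D_2 \le D_1 \le \sigma_x^2, \end{cases} \qquad (19)$$

where $R_G^{CR}(D, N)$ is defined in (18) and

$$\tilde{R}_{HB}^{CR}(D_1, D_2) \triangleq \frac{1}{2} \log_2\left( \frac{\sigma_x^2 (D_1 + N_1 + N_2)(D_2 + N_2)}{(\sigma_x^2 + N_1 + N_2)(D_1 + N_2) D_2} \right). \qquad (20)$$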

Remark 5. The rate-distortion function for the HB problem for Gaussian sources (17) without
the CR constraint can be found in [6]. Comparison with (19) confirms the performance loss
discussed in Remark 2.

The characterization of the rate-distortion function (19) requires different considerations for the four
subregions of the $(D_1, D_2)$ plane sketched in Fig. 5. In fact, for $D_1 \ge \sigma_x^2$ and $D_2 \ge \sigma_x^2$, the
required rate is zero, since the distortion constraints are trivially met by setting $\hat{X}_1 = \hat{X}_2 = 0$
in the achievable rate (11). For the case $D_1 \ge \sigma_x^2$ and $D_2 \le \sigma_x^2$, it is sufficient to cater only to
Decoder 2 by setting $\hat{X}_1 = 0$ and $X = \hat{X}_2 + Q_2$, with $Q_2 \sim \mathcal{N}(0, D_2)$ independent of $\hat{X}_2$, in the
achievable rate (11). That this rate cannot be improved upon follows from the trivial converse

$$R_{HB}^{CR}(D_1, D_2) \ge \max\{R_G^{CR}(D_1, N_1 + N_2), R_G^{CR}(D_2, N_2)\}, \qquad (21)$$

which follows by cut-set arguments. The same converse suffices also for the regime $D_1 \le
\sigma_x^2$ and $D_2 \ge \min(D_1, \sigma_x^2)$. For this case, achievability follows by setting $X = \hat{X}_1 + Q_1$ and
$\hat{X}_1 = \hat{X}_2$ in (11), where $Q_1 \sim \mathcal{N}(0, D_1)$ is independent of $\hat{X}_1$. In the remaining case, namely
$D_2 \le D_1 \le \sigma_x^2$, the rate-distortion function does not follow from the point-to-point result (18)
as for the regimes discussed thus far. The analysis of this case requires use of the entropy-power
inequality (EPI) and can be found in Appendix B.
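The four regimes discussed above translate into a short piecewise function. The Python sketch below is one way to evaluate (19) together with (20); the function and argument names are illustrative, and the parameter values in the usage lines are the ones quoted for Fig. 6.

```python
import math


def rate_cr_gaussian(D, N, var_x):
    """Point-to-point rate-distortion function with CR, as in (18)."""
    if D >= var_x:
        return 0.0
    return 0.5 * math.log2(var_x * (D + N) / ((var_x + N) * D))


def rate_hb_cr_gaussian(D1, D2, N1, N2, var_x):
    """HB rate-distortion function with CR for Gaussian sources, as in (19)-(20)."""
    if D1 >= var_x and D2 >= var_x:
        return 0.0                                    # both constraints trivially met
    if D1 >= var_x:
        return rate_cr_gaussian(D2, N2, var_x)        # only Decoder 2 matters
    if D2 >= D1:
        return rate_cr_gaussian(D1, N1 + N2, var_x)   # only Decoder 1 matters
    # Remaining regime D2 < D1 < var_x: expression (20).
    num = var_x * (D1 + N1 + N2) * (D2 + N2)
    den = (var_x + N1 + N2) * (D1 + N2) * D2
    return 0.5 * math.log2(num / den)


if __name__ == "__main__":
    var_x, N1, N2 = 4.0, 2.0, 3.0
    for D2 in (1.0, 2.0, 5.0):
        print(D2, [round(rate_hb_cr_gaussian(D1, D2, N1, N2, var_x), 4)
                   for D1 in (0.5, 1.0, 2.0, 3.0, 4.0, 5.0)])
```

The branches overlap only on region boundaries, where the expressions in (19) coincide; for instance, setting $D_2 = D_1$ in (20) recovers $R_G^{CR}(D_1, N_1 + N_2)$.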

Fig. 6 depicts the rate $R_{HB}^{CR}(D_1, D_2)$ in (19) versus $D_1$ for different values of $D_2$ with $\sigma_x^2 = 4$,
$N_1 = 2$, and $N_2 = 3$. As discussed above, for $D_2 = 5$, which is larger than $\sigma_x^2$, $R_{HB}^{CR}(D_1, D_2)$
becomes zero for values of $D_1$ larger than $\sigma_x^2 = 4$, while this is not the case for values $D_2 < \sigma_x^2$.



D. Binary Source with Erased Side Information and Hamming or Erasure Distortion 

Figure 6. The rate-distortion function $R_{HB}^{CR}(D_1, D_2)$ in (19) versus distortion $D_1$ for different values of distortion $D_2$ and for
$\sigma_x^2 = 4$, $N_1 = 2$, and $N_2 = 3$.

In this section, we consider a binary source $X \sim \text{Ber}(1/2)$ with erased side information
sequences $Y_1$ and $Y_2$. The side information $Y_2$ is an erased version of the source $X$ with erasure probability
$p_2$, and $Y_1$ is an erased version of $X$ with erasure probability $p_1 \ge p_2$. This means that $Y_j = e$,
where $e$ represents an erasure, with probability $p_j$ and $Y_j = X$ with probability $1 - p_j$. Note
that, with these assumptions, the side information $Y_1$ is stochastically degraded with respect to
$Y_2$. In fact, we have the factorization (10), where the distributions $p(y_2|x)$ and $\tilde{p}(y_1|y_2)$
are illustrated in Fig. 7. As seen in Fig. 7, the pmf $\tilde{p}(y_1|y_2)$ is characterized by the probability
$\tilde{p}_1$ that satisfies the equality $p_1 = p_2 + \tilde{p}_1(1 - p_2)$. We focus on Hamming and erasure distortions.
For the Hamming distortion, the reconstruction alphabets are binary, $\hat{\mathcal{X}}_1 = \hat{\mathcal{X}}_2 = \{0, 1\}$, and we
have $d_j(x, \hat{x}_j) = 0$ if $x = \hat{x}_j$ and $d_j(x, \hat{x}_j) = 1$ otherwise for $j = 1,2$. Instead, for the erasure
distortion the reconstruction alphabets are $\hat{\mathcal{X}}_1 = \hat{\mathcal{X}}_2 = \{0, 1, e\}$, and we have for $j = 1,2$:

$$d_j(x, \hat{x}_j) = \begin{cases} 0 & \text{for } \hat{x}_j = x \\ 1 & \text{for } \hat{x}_j = e \\ \infty & \text{otherwise.} \end{cases} \qquad (22)$$
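The relation $p_1 = p_2 + \tilde{p}_1(1 - p_2)$ behind the degradedness of the erased side information can be checked with a few lines of Python. This is a small sketch with hypothetical function names: the first function solves for the extra erasure probability $\tilde{p}_1$, the second composes the two erasure stages.

```python
def extra_erasure_probability(p1, p2):
    """Solve p1 = p2 + pt * (1 - p2) for pt, the erasure probability from Y2 to Y1."""
    if not 0.0 <= p2 <= p1 <= 1.0:
        raise ValueError("need 0 <= p2 <= p1 <= 1")
    if p2 == 1.0:
        return 0.0   # Y2 is always erased, so any pt works; 0 is an arbitrary choice
    return (p1 - p2) / (1.0 - p2)


def cascade_erasure_probability(p2, pt):
    """Overall erasure probability of Y1 when Y2 (erased w.p. p2) is erased again w.p. pt."""
    return p2 + pt * (1.0 - p2)
```

For example, with $p_1 = 1$ and $p_2 = 0.35$ the extra erasure probability is 1, meaning $Y_1$ carries no information about $X$.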



Figure 7. Illustration of the pmfs in the factorization (10) of the joint distribution $p(x, y_1, y_2)$ for a binary source $X$ and erased
side information sequences $(Y_1, Y_2)$.

In Appendix C, we prove that for the point-to-point set-up in Fig. 1 with $X \sim \text{Ber}(1/2)$ and
erased side information $Y$, with erasure probability $p$, the rate-distortion function with CR under
Hamming distortion is given by

$$R_{X|Y}^{CR}(D) = R_B^{CR}(D, p) = \begin{cases} p(1 - H(D)) & \text{for } D \le 1/2 \\ 0 & \text{for } D > 1/2, \end{cases} \qquad (23)$$

where we have made explicit the dependence on $p$ of the function $R_B^{CR}(D, p)$ for convenience. The
rate-distortion function (23) for $D \le 1/2$ is obtained from (2) by choosing the distribution $p(\hat{x}|x)$
such that $X = \hat{X} \oplus Q$, where $Q \sim \text{Ber}(D)$ is independent of $\hat{X}$. Following the same steps as in
Appendix C, it can also be proved that for the point-to-point set-up in Fig. 1 with $X \sim \text{Ber}(1/2)$
and erased side information $Y$, with erasure probability $p$, the rate-distortion function with CR
under erasure distortion is given by

$$R_{X|Y}^{CR}(D) = R_{BE}^{CR}(D, p) = p(1 - D). \qquad (24)$$

The rate-distortion function (24) is obtained from (2) by choosing the distribution $p(\hat{x}|x)$ such
that $\hat{X} = X$ with probability $1 - D$ and $\hat{X} = e$ with probability $D$.

Remark 6. The rate-distortion function with erased side information and Hamming distortion
without the CR constraint is derived in [17] (see also [18]). Comparison with (23) shows again
the limitation imposed by the CR constraint on the use of side information (see Remark 2).
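The point-to-point expressions (23) and (24) are simple enough to evaluate directly. The following Python sketch does so; function names are illustrative, and the erasure-distortion function assumes $0 \le D \le 1$, which is the range in which (24) is a valid rate.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def rate_cr_binary_hamming(D, p):
    """Rate-distortion function with CR under Hamming distortion, as in (23)."""
    if D >= 0.5:
        return 0.0
    return p * (1.0 - binary_entropy(D))


def rate_cr_binary_erasure(D, p):
    """Rate-distortion function with CR under erasure distortion, as in (24).

    Assumes 0 <= D <= 1.
    """
    return p * (1.0 - D)
```

In both cases the rate scales with the erasure probability $p$: a description of $\hat{X}$ is needed only for the fraction of symbols that the side information fails to reveal.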

Proposition 3. The rate-distortion function for the HB problem with CR for the binary source
with the stochastically degraded erased side information sequences illustrated in Fig. 7 under
Hamming distortion is given by

$$R_{HB}^{CR}(D_1, D_2) = \begin{cases} 0 & \text{if } D_1 \ge 1/2 \text{ and } D_2 \ge 1/2, \\ R_B^{CR}(D_1, p_1) & \text{if } D_1 \le 1/2 \text{ and } D_2 \ge \min(D_1, 1/2) \\ R_B^{CR}(D_2, p_2) & \text{if } D_1 \ge 1/2 \text{ and } D_2 \le 1/2 \\ \tilde{R}_{HB}^{CR}(D_1, D_2) & \text{if } D_2 \le D_1 \le 1/2, \end{cases} \qquad (25)$$

where $R_B^{CR}(D, p)$ is defined in (23) and

$$\tilde{R}_{HB}^{CR}(D_1, D_2) \triangleq p_1(1 - H(D_1)) + p_2(H(D_1) - H(D_2)). \qquad (26)$$

Moreover, for the same source under erasure distortion the rate-distortion function is given
by (25) by substituting $R_B^{CR}(D_j, p_j)$ with $R_{BE}^{CR}(D_j, p_j)$ as defined in (24) for $j = 1,2$ and by
substituting (26) with

$$\tilde{R}_{HB,E}^{CR}(D_1, D_2) \triangleq p_1(1 - D_1) + p_2(D_1 - D_2). \qquad (27)$$

Figure 8. Illustration of the distortion regions in the rate-distortion function (25) for a binary source with degraded erased side
information and Hamming distortion.
Similar to the Gaussian example, the characterization of the rate-distortion function (25)
requires different considerations for the four subregions of the $(D_1, D_2)$ plane sketched in Fig.
8. In fact, for $D_1 \ge 1/2$ and $D_2 \ge 1/2$, the required rate is zero, since the distortion constraints
are trivially met by setting $\hat{X}_1 = \hat{X}_2 = 0$ in the achievable rate (11). For the case $D_1 \ge
1/2$ and $D_2 \le 1/2$, it is sufficient to cater only to Decoder 2 by setting $\hat{X}_1 = 0$ and $X = \hat{X}_2 \oplus Q_2$,
with $Q_2 \sim \text{Ber}(D_2)$ independent of $\hat{X}_2$, in the achievable rate (11). That this rate cannot be
improved upon is a consequence of the trivial converse

$$R_{HB}^{CR}(D_1, D_2) \ge \max\{R_B^{CR}(D_1, p_1), R_B^{CR}(D_2, p_2)\}, \qquad (28)$$

which follows by cut-set arguments. The same converse suffices also for the regime $D_1 \le
1/2$ and $D_2 \ge \min(D_1, 1/2)$. For this case, achievability follows by setting $X = \hat{X}_1 \oplus Q_1$ and
$\hat{X}_1 = \hat{X}_2$ in (11), where $Q_1 \sim \text{Ber}(D_1)$ is independent of $\hat{X}_1$. In the remaining case, namely
$D_2 \le D_1 \le 1/2$, the rate-distortion function does not follow from the point-to-point result (23)
as for the regimes discussed thus far. The analysis of this case can be found in Appendix D.
Similar arguments apply also for the erasure distortion metric.
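The regime-by-regime discussion for Hamming distortion can be summarized as a piecewise function. The Python sketch below evaluates (25) with (26); names are illustrative, and only the Hamming case is coded, since the text describes the erasure case by substitution rather than restating its regions.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def rate_cr_binary_hamming(D, p):
    """Point-to-point rate-distortion function with CR under Hamming distortion (23)."""
    if D >= 0.5:
        return 0.0
    return p * (1.0 - binary_entropy(D))


def rate_hb_cr_binary_hamming(D1, D2, p1, p2):
    """HB rate-distortion function with CR under Hamming distortion, as in (25)-(26)."""
    if D1 >= 0.5 and D2 >= 0.5:
        return 0.0                                   # both constraints trivially met
    if D1 >= 0.5:
        return rate_cr_binary_hamming(D2, p2)        # only Decoder 2 matters
    if D2 >= D1:
        return rate_cr_binary_hamming(D1, p1)        # only Decoder 1 matters
    # Remaining regime D2 < D1 < 1/2: expression (26).
    return (p1 * (1.0 - binary_entropy(D1))
            + p2 * (binary_entropy(D1) - binary_entropy(D2)))
```

With $p_1 = 1$ and $p_2 = 0.35$, for example, `rate_hb_cr_binary_hamming(0.2, 0.1, 1.0, 0.35)` adds to the Decoder 1 rate $1 - H(0.2)$ a refinement term $0.35\,(H(0.2) - H(0.1))$ that is paid only when $Y_2$ is erased.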

We now compare the rate-distortion function for the binary source $X \sim \text{Ber}(1/2)$ with erased side
information under Hamming distortion for three settings. In the first setting, known as the Kaspi
model [7], the encoder knows the side information, and thus the position of the erasures. For this
case, the rate-distortion function $R_{Kaspi}(D_1, D_2)$ for the example at hand was calculated in [17].
Note that in the Kaspi model, the CR constraint does not affect the rate-distortion performance
since the encoder has all the information available at the decoders. The second model of interest
is the standard HB setting with no CR constraint, whose rate-distortion function $R_{HB}(D_1, D_2)$
for the example at hand was derived in [12]. The third model is the HB set-up with CR studied
here. We clearly have the inequalities

$$R_{Kaspi}(D_1, D_2) \le R_{HB}(D_1, D_2) \le R_{HB}^{CR}(D_1, D_2), \qquad (29)$$

where the first inequality in (29) accounts for the impact of the availability of the side information
at the encoder, while the second reflects the potential performance loss due to the CR constraint.

Fig. m shows the aforementioned rate-distortion functions with pi = 1 and p2 = 0.35, which 
corresponds to the case where Decoder 1 has no side information, for two values of the distortion 
D2 versus the distortion Di. For -D2 > ^ = 0.175, the given settings reduce to one in which 
the encoder needs to communicate information only to Decoder 1. Since Decoder 1 has no 
side information, the Kaspi and HB settings yield equal performance i.e., RxaspiiDi, D2) = 
RHsiDi, D2). Moreover, if Di is sufficiently smaller than D2, the operation of the encoder is 
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limited by the distortion requirements of Decoder 1 . In this case, Decoder 2 can in fact reconstruct 
as Xi = X2 while still satisfying its distortion constraints. Therefore, we obtain the same 
performance in all of the three settings, i.e., RxaspiiDi, D2) = Rhb{Di, D2) = R^^{Di,D2). 
We also note the general performance loss due to the CR constraint, unless, as discussed above, 
distortion Di is sufficiently smaller than D2- 
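The threshold $p_2/2 = 0.175$ used above is the Hamming distortion that a decoder with erased side information attains without receiving any message: it copies the unerased symbols and guesses on the erasures. The following is a minimal sketch of this calculation, assuming $X \sim \mathrm{Ber}(1/2)$ and a side information symbol that equals $X$ with probability $1 - p$ and is erased otherwise; the function name `zero_rate_hamming_distortion` is illustrative and does not appear in the paper.

```python
def zero_rate_hamming_distortion(p_erase: float) -> float:
    """Minimum expected Hamming distortion with no message from the encoder.

    Assumed model: X ~ Ber(1/2); the side information Y equals X with
    probability 1 - p_erase and is the erasure symbol "e" otherwise.
    """
    # Joint pmf of (x, y) under the assumed erasure model.
    joint = {}
    for x in (0, 1):
        joint[(x, x)] = 0.5 * (1.0 - p_erase)
        joint[(x, "e")] = 0.5 * p_erase

    distortion = 0.0
    for y in (0, 1, "e"):
        # For each observed y, pick the guess with the smallest error mass.
        error_mass = [
            sum(prob for (x, yy), prob in joint.items() if yy == y and x != guess)
            for guess in (0, 1)
        ]
        distortion += min(error_mass)
    return distortion
```

With `p_erase = 0.35` the function evaluates the distortion at which Decoder 2 no longer needs any rate in the example above.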

III. Heegard-Berger Problem with Cooperative Decoders 

The system model for the HB problem with CR and decoder cooperation is similar to the
one provided in Sec. II-A with the following differences. Here, in addition to the encoding function
of Definition 1, which maps the source sequence $X^n$ into a message $J_1$ of $nR_1$ bits, there is an
encoder at Decoder 1 given by

$$g_2\colon [1, 2^{nR_1}] \times \mathcal{Y}_1^n \to [1, 2^{nR_2}], \tag{30}$$

which maps message $J_1$ and the side information sequence $Y_1^n$ into a message $J_2$. Moreover, instead of
the decoding function given in (5), we have the decoding function for Decoder 2

$$h_2\colon [1, 2^{nR_1}] \times [1, 2^{nR_2}] \times \mathcal{Y}_2^n \to \hat{\mathcal{X}}_2^n, \tag{31}$$

which maps the messages $J_1$ and $J_2$ and the side information $Y_2^n$ into the estimated sequence $\hat{X}_2^n$.

A. Rate-Distortion Region for $X - Y_1 - Y_2$

In this section, a single-letter characterization of the rate-distortion region is derived under
the assumption that the joint pmf $p(x, y_1, y_2)$ is such that the Markov chain $X - Y_1 - Y_2$ holds.

Proposition 4. The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the HB source coding problem with
CR and cooperative decoders under the assumption $X - Y_1 - Y_2$ is given by the union of all
rate pairs $(R_1, R_2)$ that satisfy the conditions

$$R_1 \ge I(X; \hat{X}_1 \hat{X}_2 | Y_1) \tag{32a}$$
$$\text{and } R_1 + R_2 \ge I(X; \hat{X}_2 | Y_2) + I(X; \hat{X}_1 | Y_1, \hat{X}_2), \tag{32b}$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, \hat{x}_1, \hat{x}_2) = p(x, y_1)\, p(y_2 | y_1)\, p(\hat{x}_1, \hat{x}_2 | x), \tag{33}$$

for some pmf $p(\hat{x}_1, \hat{x}_2 | x)$ such that the constraints (13) are satisfied.

The proof of the converse can be easily established following cut-set arguments for bound
(32a), while the bound (32b) on the sum-rate $R_1 + R_2$ can be proved following the same
steps as in Appendix A and substituting $J$ with $(J_1, J_2)$. As for the achievability, it follows
as a straightforward extension of [8, Sec. III] to the setup at hand where Decoder 2 has side
information as well. It is worth emphasizing that the reconstruction $\hat{X}_2$ for Decoder 2, which
has degraded side information, is conveyed by using both the direct link from the Encoder, of
rate $R_1$, and the path Encoder-Decoder 1-Decoder 2. The latter path leverages the better side
information at Decoder 1 and the cooperative link of rate $R_2$.

Note that, unlike the conventional HB problem studied in Sec. II, the rate-distortion region with
cooperative decoders depends on the joint distribution of the variables $(Y_1, Y_2)$, and thus stochastic
and physical degradedness of the side information sequences lead to different results.
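The bounds (32a) and (32b) can be evaluated numerically for any fixed test channel. The sketch below does so for a toy binary model that is not taken from the paper: $X \sim \mathrm{Ber}(1/2)$, $Y_1$ is $X$ through an erasure channel, $Y_2$ is $Y_1$ through a further erasure channel (so that $X - Y_1 - Y_2$ holds), and the test channel is a cascade of two binary symmetric channels. All function names and parameters (`build_joint`, `prop4_bounds`, `e1`, `e2`, `flip1`, `flip2`) are assumptions made for illustration, and the output is the pair of right-hand sides for one test channel only, not the full region, which requires a union over all admissible test channels.

```python
import numpy as np

ERASED = 2  # index used for the erasure symbol in the side-information alphabets


def entropy(joint, axes):
    """Entropy in bits of the marginal of `joint` over the listed axes."""
    if not axes:
        return 0.0
    drop = tuple(i for i in range(joint.ndim) if i not in axes)
    marginal = joint.sum(axis=drop).ravel()
    marginal = marginal[marginal > 0]
    return float(-(marginal * np.log2(marginal)).sum())


def cond_mutual_info(joint, a, b, c=()):
    """I(A; B | C) in bits, with A, B, C given as tuples of axis indices."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))


def build_joint(e1, e2, flip1, flip2):
    """Toy pmf p(x, y1) p(y2 | y1) p(x1h, x2h | x), axes (x, y1, y2, x1h, x2h)."""
    joint = np.zeros((2, 3, 3, 2, 2))
    for x, y1, y2, x1h, x2h in np.ndindex(*joint.shape):
        # Y1: X observed through an erasure channel with erasure probability e1.
        if y1 == ERASED:
            p_y1 = e1
        elif y1 == x:
            p_y1 = 1.0 - e1
        else:
            p_y1 = 0.0
        # Y2: Y1 observed through a further erasure channel, so X - Y1 - Y2 holds.
        if y1 == ERASED:
            p_y2 = 1.0 if y2 == ERASED else 0.0
        elif y2 == ERASED:
            p_y2 = e2
        elif y2 == y1:
            p_y2 = 1.0 - e2
        else:
            p_y2 = 0.0
        # Test channel: x1h flips x w.p. flip1, x2h flips x1h w.p. flip2.
        p_x1h = flip1 if x1h != x else 1.0 - flip1
        p_x2h = flip2 if x2h != x1h else 1.0 - flip2
        joint[x, y1, y2, x1h, x2h] = 0.5 * p_y1 * p_y2 * p_x1h * p_x2h
    return joint


def prop4_bounds(joint):
    """Right-hand sides of (32a) and (32b) for one fixed test channel."""
    X, Y1, Y2, X1H, X2H = range(5)
    r1_bound = cond_mutual_info(joint, (X,), (X1H, X2H), (Y1,))
    sum_bound = (cond_mutual_info(joint, (X,), (X2H,), (Y2,))
                 + cond_mutual_info(joint, (X,), (X1H,), (Y1, X2H)))
    return r1_bound, sum_bound
```

Because $Y_2$ is degraded with respect to $Y_1$, the sum-rate bound is never smaller than the bound on $R_1$ alone, which the sketch can be used to confirm for any choice of parameters.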

Remark 7. If we remove the CR constraint, the problem of determining the rate-distortion region
for the setting of Fig. 3 under the assumption $X - Y_1 - Y_2$ is still open. In [8], inner and outer
bounds on the rate-distortion region are obtained for the case in which the side information $Y_2$ is
absent. The bounds were shown to coincide for the case where Decoder 1 wishes to recover $X$
losslessly (i.e., $D_1 = 0$) and also for certain distortion regimes in the quadratic Gaussian case.
Moreover, the rate-distortion tradeoff is completely characterized in [8] for the case in which
the encoder also has access to the side information. We note that, as per the discussion in Sec.
II-D, these latter results immediately carry over to the case with the CR constraint since the encoder
is informed about the side information.

Remark 8. To understand why imposing the CR constraint simplifies the problem of obtaining
a single-letter characterization of the rate-distortion function, let us consider the degrees of
freedom available at Decoder 1 in Fig. 3 for the use of the link of rate $R_2$. In general, Decoder
1 can follow two possible strategies: the first is forwarding, whereby Decoder 1 simply forwards
some of the bits received from the encoder to Decoder 2; the second is recompression,
whereby the data received from the encoder is combined with the available side information $Y_1^n$,
compressed to at most $R_2$ bits per symbol, and then sent to Decoder 2. It is the interplay and
contrast between these two strategies that makes the general problem hard to solve. In particular,
while the strategies of forwarding and recompression, and combinations thereof, appear to be natural
candidates for the problem, finding a matching converse when both such degrees of freedom are
permissible at the decoder is difficult (see, e.g., [20]). However, under the CR constraint, the
strategy of recompression becomes irrelevant, since any information about the side information
that is not also available at the encoder cannot be leveraged by Decoder 2 without violating
the CR constraint. This restriction in the set of available strategies for Decoder 1 makes the
problem easier to address under the CR constraint.

B. Rate-Distortion Region for $X - Y_2 - Y_1$

In this section, a single-letter characterization of the rate-distortion region is derived under the
assumption that the joint pmf $p(x, y_1, y_2)$ is such that the Markov chain relationship $X - Y_2 - Y_1$
holds.

Proposition 5. The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the HB source coding problem with
CR and cooperative decoders under the assumption that the Markov chain relationship $X - Y_2 - Y_1$ holds
is given by the union of all rate pairs $(R_1, R_2)$ that satisfy the conditions

$$R_1 \ge I(X; \hat{X}_1 | Y_1) + I(X; \hat{X}_2 | Y_2, \hat{X}_1) \tag{34a}$$
$$\text{and } R_2 \ge 0, \tag{34b}$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, \hat{x}_1, \hat{x}_2) = p(x, y_2)\, p(y_1 | y_2)\, p(\hat{x}_1, \hat{x}_2 | x), \tag{35}$$

for some pmf $p(\hat{x}_1, \hat{x}_2 | x)$ such that the constraints (13) are satisfied.

The proof of achievability follows immediately by neglecting the link of rate $R_2$ and using rate
$R_1$ as per the HB scheme of Proposition 1. The converse follows by considering an enhanced
system in which Decoder 2 is provided with the side information of Decoder 1. In this system,
link $R_2$ becomes useless since Decoder 2 possesses all the information available at Decoder
1. It follows that the system reduces to the HB problem with degraded sources studied in the
previous section and the bound (34a) follows immediately from Proposition 1.

Remark 9. In the case without CR, the rate-distortion function is given similarly to (34), but with
the HB rate-distortion function (15) in lieu of the rate-distortion function of the HB problem
with CR in (34a).

IV. Cascade Source Coding with Common Reconstruction 

In this section, we first detail the system model in Fig. 4 of cascade source coding with CR.
As mentioned in Sec. I, the motivation for studying this class of models comes from multi-hop
applications. Next, the characterization of the corresponding rate-distortion performance is
presented under the assumption that one of the two side information sequences is a degraded
version of the other. Finally, following the previous section, three specific examples are worked
out, namely Gaussian sources under quadratic distortion (Sec. IV-C1), and binary sources with
side information subject to erasures under Hamming or erasure distortion (Sec. IV-C2).

A. System model

In this section, the system model for the cascade source coding problem with CR is detailed
similarly to Sec. II-A. The problem is defined by the pmf $p_{XY_1Y_2}(x, y_1, y_2)$ and discrete alphabets
$\mathcal{X}$, $\mathcal{Y}_1$, $\mathcal{Y}_2$, $\hat{\mathcal{X}}_1$, and $\hat{\mathcal{X}}_2$ as follows. The source sequence $X^n$ and side information sequences
$Y_1^n$ and $Y_2^n$, with $X^n \in \mathcal{X}^n$, $Y_1^n \in \mathcal{Y}_1^n$, and $Y_2^n \in \mathcal{Y}_2^n$, are such that the tuples $(X_i, Y_{1i}, Y_{2i})$ for
$i \in [1, n]$ are i.i.d. with joint pmf $p_{XY_1Y_2}(x, y_1, y_2)$. Node 1 measures a sequence $X^n$ and encodes
it into a message $J_1$ of $nR_1$ bits, which is delivered to Node 2. Node 2 estimates a sequence
$\hat{X}_1^n \in \hat{\mathcal{X}}_1^n$ within given distortion requirements. Node 2 also encodes the message $J_1$ received
from Node 1 and the sequence $Y_1^n$ into a message $J_2$ of $nR_2$ bits, which is delivered to Node
3. Node 3 estimates a sequence $\hat{X}_2^n \in \hat{\mathcal{X}}_2^n$ within given distortion requirements. Distortion and
CR requirements are defined as in Sec. II-A, leading to the following definition.

Definition 2. An $(n, R_1, R_2, D_1, D_2, \epsilon)$ code for the cascade source coding problem with CR
consists of an encoding function for Node 1,

$$g_1\colon \mathcal{X}^n \to [1, 2^{nR_1}], \tag{36}$$

which maps the source sequence $X^n$ into a message $J_1$; an encoding function for Node 2,

$$g_2\colon [1, 2^{nR_1}] \times \mathcal{Y}_1^n \to [1, 2^{nR_2}], \tag{37}$$

which maps the side information sequence $Y_1^n$ and message $J_1$ into a message $J_2$; a decoding function
for Node 2,

$$h_1\colon [1, 2^{nR_1}] \times \mathcal{Y}_1^n \to \hat{\mathcal{X}}_1^n, \tag{38}$$

which maps message $J_1$ and the side information $Y_1^n$ into the estimated sequence $\hat{X}_1^n$; a decoding
function for Node 3,

$$h_2\colon [1, 2^{nR_2}] \times \mathcal{Y}_2^n \to \hat{\mathcal{X}}_2^n, \tag{39}$$

which maps message $J_2$ and the side information $Y_2^n$ into the estimated sequence $\hat{X}_2^n$; and two
encoder reconstruction functions as in (7), which map the source sequence into the estimated
sequences $\psi_1(X^n)$ and $\psi_2(X^n)$ at Node 1; such that the distortion constraints (8) and the CR
requirements (9) are satisfied.

Given a distortion pair $(D_1, D_2)$, a rate pair $(R_1, R_2)$ is said to be achievable if, for any $\epsilon > 0$
and sufficiently large $n$, there exists an $(n, R_1, R_2, D_1 + \epsilon, D_2 + \epsilon, \epsilon)$ code. The rate-distortion
region $\mathcal{R}(D_1, D_2)$ is defined as the closure of all rate pairs $(R_1, R_2)$ that are achievable given
the distortion pair $(D_1, D_2)$.

B. Rate-Distortion Region for $X - Y_1 - Y_2$

In this section, a single-letter characterization of the rate-distortion region is derived under the
assumption that the joint pmf $p(x, y_1, y_2)$ is such that the Markov chain relationship $X - Y_1 - Y_2$
holds.

Proposition 6. The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the cascade source coding problem
with CR is given by the union of all rate pairs $(R_1, R_2)$ that satisfy the conditions

$$R_1 \ge I(X; \hat{X}_1 \hat{X}_2 | Y_1) \tag{40a}$$
$$\text{and } R_2 \ge I(X; \hat{X}_2 | Y_2), \tag{40b}$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, \hat{x}_1, \hat{x}_2) = p(x, y_1)\, p(y_2 | y_1)\, p(\hat{x}_1, \hat{x}_2 | x), \tag{41}$$

for some pmf $p(\hat{x}_1, \hat{x}_2 | x)$ such that the constraints (13) are satisfied.

The proof of the converse is easily established following cut-set arguments. To prove achievability,
it is sufficient to consider a scheme based on binning at Node 1 and decode and rebin
at Node 2. Specifically, Node 1 randomly generates a standard lossy source code $\hat{X}_1^n$
for the source $X^n$ with rate $I(X; \hat{X}_1)$ bits per source symbol. Random binning is used to reduce
the rate to $I(X; \hat{X}_1 | Y_1)$. Node 1 then maps the source $X^n$ into the reconstruction sequence $\hat{X}_2^n$
using a codebook that is generated conditional on $\hat{X}_1^n$ with rate $I(X; \hat{X}_2 | \hat{X}_1)$ bits per source
symbol. Using the side information $Y_1^n$ available at Node 2, random binning is again used to
reduce the rate to $I(X; \hat{X}_2 | Y_1 \hat{X}_1)$. The codebook of $\hat{X}_2^n$ is also randomly binned to the rate
$I(X; \hat{X}_2 | Y_2)$. Node 2, having recovered $\hat{X}_2^n$, forwards the corresponding bin index to Node 3.
The latter, by choice of the binning rate, is able to obtain $\hat{X}_2^n$. Note that, since the reconstruction
sequences $\hat{X}_1^n$ and $\hat{X}_2^n$ are generated by the encoder, functions $\psi_1$ and $\psi_2$ that guarantee the
CR constraints (9) exist by construction.

As for the HB problem with cooperative decoders studied in Sec. III, the rate-distortion region of
the cascade source coding problem depends on the joint distribution of the variables $(Y_1, Y_2)$, and
thus stochastic and physical degradedness of the side information sequences lead to different results.

Remark 10. If we remove the CR constraint, the problem of determining the rate-distortion
region for the setting of Fig. 4 under the Markov condition $X - Y_1 - Y_2$ is still open. In the
special case in which $Y_1 = Y_2$ the problem has been solved in [10] for Gaussian sources under
quadratic distortion and in [12] for binary sources with erased side information under Hamming
distortion.

Remark 11. Following Remark 3, if both side information sequences are causal, it can be shown
that they have no impact on the rate-distortion function (40). Therefore, the rate-distortion region
follows immediately from the results in (40) by removing both of the side information terms.
Note that with causal side information sequences the rate-distortion function holds for any joint
pmf $p(x, y_1, y_2)$ with no degradedness requirements. Moreover, if only the side information $Y_2$
is causal, while $Y_1$ is still observed non-causally, then the side information $Y_2$ can be neglected
without loss of optimality, and the rate-distortion region follows from (40) by removing the
conditioning on $Y_2$.

C. Bounds on the Rate-Distortion Region for $X - Y_2 - Y_1$

In this section, outer and inner bounds are derived for the rate-distortion region under the
assumption that the joint pmf $p(x, y_1, y_2)$ is such that the Markov chain relationship $X - Y_2 - Y_1$
holds. The bounds are then shown to coincide in Sec. IV-C1 for Gaussian sources and in Sec.
IV-C2 for binary sources with erased side information.

Proposition 7. (Outer bound) The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the cascade source
coding problem with CR is contained in the region $\mathcal{R}_{\mathrm{out}}^{\mathrm{CR}}(D_1, D_2)$, which is given by the set of
all rate pairs $(R_1, R_2)$ that satisfy the conditions

$$R_1 \ge R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2) \tag{42a}$$
$$\text{and } R_2 \ge R_{X|Y_2}^{\mathrm{CR}}(D_2), \tag{42b}$$

where $R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2)$ is defined in (11) and we have $R_{X|Y_2}^{\mathrm{CR}}(D_2) = \min I(X; \hat{X}_2 | Y_2)$, where
the minimization is performed with respect to the conditional pmf $p(\hat{x}_2 | x)$ under the distortion
constraints (13) for $j = 2$.

Proposition 8. (Inner bound) The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the cascade source
coding problem with CR contains the region $\mathcal{R}_{\mathrm{in}}^{\mathrm{CR}}(D_1, D_2)$, which is given by the union of all
rate pairs $(R_1, R_2)$ that satisfy the conditions

$$R_1 \ge I(X; \hat{X}_1 | Y_1) + I(X; \hat{X}_2 | Y_2 \hat{X}_1) \tag{43a}$$
$$\text{and } R_2 \ge I(X; \hat{X}_1 | Y_2) + I(X; \hat{X}_2 | \hat{X}_1 Y_2) \tag{43b}$$
$$= I(X; \hat{X}_1 \hat{X}_2 | Y_2), \tag{43c}$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, \hat{x}_1, \hat{x}_2) = p(x, y_2)\, p(y_1 | y_2)\, p(\hat{x}_1, \hat{x}_2 | x), \tag{44}$$

for some pmf $p(\hat{x}_1, \hat{x}_2 | x)$ such that the distortion constraints (13) are satisfied.

The outer bound in Proposition 7 follows immediately from cut-set arguments similar to those
in [10] and [12]. As for the inner bound of Proposition 8, the strategy works as follows. Node
1 sends the description $\hat{X}_1^n$ to Node 2 using binning with rate $I(X; \hat{X}_1 | Y_1)$. It also maps the
sequence $X^n$ into the sequence $\hat{X}_2^n$ using a conditional codebook with respect to $\hat{X}_1^n$, which
is binned in order to leverage the side information $Y_2^n$ at Node 3 with rate $I(X; \hat{X}_2 | \hat{X}_1, Y_2)$.
Node 2 recovers $\hat{X}_1^n$, whose codebook is then binned to rate $I(X; \hat{X}_1 | Y_2)$. Then, it forwards
the bin index so obtained for $\hat{X}_1^n$ and the bin index for the codebook of $\hat{X}_2^n$ produced by Node
1 to Node 3. By the choice of the rates, the latter can recover both $\hat{X}_1^n$ and $\hat{X}_2^n$. Since both
descriptions are produced by Node 1, the CR constraint is automatically satisfied.

The inner and outer bounds defined above do not coincide in general. However, in the next
sections, we provide two examples in which they coincide and thus characterize the rate-distortion
region of the corresponding settings.

Remark 12. Without the CR constraint, the problem of deriving the rate-distortion region for the
setting at hand under the Markov chain condition $X - Y_2 - Y_1$ is open. The problem has been
solved in [10] for Gaussian sources under quadratic distortion and in [12] for binary sources
with erased side information under Hamming distortion for $Y_1 = Y_2$.

1) Gaussian Sources and Quadratic Distortion: In this section, we assume the Gaussian
sources and the quadratic distortion of Sec. II-C, and derive the rate-distortion region
for the cascade source coding problem with CR.

Proposition 9. The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the cascade source coding problem
with CR for the Gaussian sources of Sec. II-C and quadratic distortion is given by (42) with
$R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2)$ as given in Proposition 2 and $R_{X|Y_2}^{\mathrm{CR}}(D_2) = R_{\mathrm{HB}}^{\mathrm{CR}}(\sigma_x^2, D_2)$.

The proof is given in Appendix E.

2) Binary Sources with Erased Side Information and Hamming Distortion: In this section,
we assume the binary sources in Fig. 7 and the Hamming distortion as in Sec. II-D, and derive
the rate-distortion region for the cascade source coding problem with CR.

Proposition 10. The rate-distortion region $\mathcal{R}^{\mathrm{CR}}(D_1, D_2)$ for the cascade source coding problem
with CR for the binary sources in Fig. 7 and Hamming distortion is given by (42) with
$R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2)$ as given in Proposition 3 and $R_{X|Y_2}^{\mathrm{CR}}(D_2) = R_{\mathrm{HB}}^{\mathrm{CR}}(1/2, D_2)$.

The proof is given in Appendix F.

V. Heegard-Berger Problem with Constrained Reconstruction

In this section, we revisit the HB problem and relax the CR constraint to the ConR constraint
of [13]. This implies that we still adopt the code as per Definition 1, but we substitute the CR
requirement (9) with the less stringent constraint

$$\frac{1}{n} \sum_{i=1}^{n} \mathrm{E}\left[d_{e,j}(\hat{X}_{ji}, \hat{X}_{e,ji})\right] \le D_{e,j}, \quad \text{for } j = 1, 2, \tag{45}$$

where $d_{e,j}(\hat{x}_j, \hat{x}_{e,j})\colon \hat{\mathcal{X}}_j \times \hat{\mathcal{X}}_j \to [0, D_{e,\max}]$ is a per-symbol distortion metric and we have used
$\hat{X}_{ji}$ and $\hat{X}_{e,ji}$, for $j = 1, 2$, to denote the $i$th letter of the decoder's estimate $\hat{X}_j^n$ and of the
encoder's estimate $\hat{X}_{e,j}^n = \psi_j(X^n)$, respectively.

Definition 3. Given a distortion tuple $(D_{e,1}, D_{e,2}, D_1, D_2)$, a rate $R$ is said to be achievable if,
for any $\epsilon > 0$ and sufficiently large $n$, there exists an $(n, R, D_{e,1} + \epsilon, D_{e,2} + \epsilon, D_1 + \epsilon, D_2 + \epsilon, \epsilon)$
code. The rate-distortion function $R(D_{e,1}, D_{e,2}, D_1, D_2)$ is defined as $R(D_{e,1}, D_{e,2}, D_1, D_2) =
\inf\{R\colon \text{the tuple } (D_{e,1}, D_{e,2}, D_1, D_2) \text{ is achievable}\}$.

Note that, by setting $D_{e,j} = 0$ for $j = 1, 2$, and letting $d_{e,j}(\hat{x}_j, \hat{x}_{e,j})$ be the Hamming distortion
metric (i.e., $d_{e,j}(\hat{x}_j, \hat{x}_{e,j}) = 1$ if $\hat{x}_j \ne \hat{x}_{e,j}$ and $d_{e,j}(\hat{x}_j, \hat{x}_{e,j}) = 0$ if $\hat{x}_j = \hat{x}_{e,j}$), we obtain a relaxed CR
constraint in which the average per-symbol, rather than per-block, error probability criterion is
adopted.

Remark 13. The problem at hand reduces to the one studied in [13] by setting $D_1 = D_{\max}$ and
$D_{e,1} = D_{e,\max}$.
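The difference between the per-block criterion of the CR requirement and the per-symbol criterion obtained from ConR with Hamming distortion, noted after Definition 3, can be illustrated on a single pair of sequences. The sketch below is only an illustration on one realization (the actual constraints are a probability and an expectation, respectively), and the function names `block_mismatch` and `average_symbol_mismatch` are hypothetical.

```python
def block_mismatch(decoder_seq, encoder_seq):
    """Per-block criterion: 1 if the two estimates differ anywhere, else 0."""
    return int(list(decoder_seq) != list(encoder_seq))


def average_symbol_mismatch(decoder_seq, encoder_seq):
    """Per-symbol criterion: fraction of positions where the estimates differ."""
    if len(decoder_seq) != len(encoder_seq):
        raise ValueError("sequences must have the same length")
    n = len(decoder_seq)
    return sum(a != b for a, b in zip(decoder_seq, encoder_seq)) / n


# Two length-1000 estimates that disagree in a single position.
encoder_estimate = [0] * 1000
decoder_estimate = [0] * 999 + [1]
per_block = block_mismatch(decoder_estimate, encoder_estimate)
per_symbol = average_symbol_mismatch(decoder_estimate, encoder_estimate)
```

A single disagreement already counts as a block error, while the per-symbol mismatch fraction shrinks as $1/n$; this is why the per-symbol criterion is the less stringent of the two.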

Proposition 11. If the side information $Y_1$ is stochastically degraded with respect to $Y_2$, the
rate-distortion function for the HB problem with ConR is given by

$$R_{\mathrm{HB}}^{\mathrm{ConR}}(D_{e,1}, D_{e,2}, D_1, D_2) = \min I(X; U_1 | Y_1) + I(X; U_2 | Y_2 U_1) \tag{46a}$$
$$= \min I(X; U_1 U_2 | Y_2) + I(U_1; Y_2 | Y_1), \tag{46b}$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, u_1, u_2) = p(x, y_1, y_2)\, p(u_1, u_2 | x), \tag{47}$$

and minimization is performed with respect to the conditional pmf $p(u_1, u_2 | x)$ and the deterministic
functions $\hat{x}_j(u_j, y_j)\colon \mathcal{U}_j \times \mathcal{Y}_j \to \hat{\mathcal{X}}_j$ and $\hat{x}_{e,j}(u_j, x)\colon \mathcal{U}_j \times \mathcal{X} \to \hat{\mathcal{X}}_j$ for $j = 1, 2$, such that
the distortion constraints $\mathrm{E}[d_j(X, \hat{x}_j(U_j, Y_j))] \le D_j$ for $j = 1, 2$, and the ConR requirements

$$\mathrm{E}\left[d_{e,j}(\hat{x}_j(U_j, Y_j), \hat{x}_{e,j}(U_j, X))\right] \le D_{e,j}, \quad \text{for } j = 1, 2, \tag{48}$$

are satisfied. Finally, $(U_1, U_2)$ are auxiliary random variables whose alphabet cardinalities can
be constrained as $|\mathcal{U}_1| \le |\mathcal{X}| + 4$ and $|\mathcal{U}_2| \le (|\mathcal{X}| + 2)^2$.

The proof is given in Appendix G.

Remark 14. Proposition 11 reduces to [13, Theorem 2] when setting $D_1 = D_{\max}$ and $D_{e,1} = D_{e,\max}$.

Remark 15. Similar to [13, Theorem 2], it can be proved that, by setting $D_{e,1} = D_{e,2} = 0$
and letting $d_{e,j}$ be the Hamming distortion for $j = 1, 2$, the rate-distortion function (46),
$R_{\mathrm{HB}}^{\mathrm{ConR}}(0, 0, D_1, D_2)$, reduces to the rate-distortion function with CR (11).

Remark 16. Similar to Remark 15, if $D_{e,1} = 0$ and $D_{e,2} = D_{e,\max}$, the rate-distortion function
(46) is given by

$$R_{\mathrm{HB}}^{\mathrm{ConR}}(0, D_{e,\max}, D_1, D_2) = \min I(X; \hat{X}_1 | Y_1) + I(X; U_2 | Y_2 \hat{X}_1), \tag{49}$$

where the mutual information terms are evaluated with respect to the joint pmf

$$p(x, y_1, y_2, u_2, \hat{x}_1) = p(x, y_1, y_2)\, p(\hat{x}_1, u_2 | x), \tag{50}$$

and minimization is performed with respect to the conditional pmf $p(\hat{x}_1, u_2 | x)$ and the deterministic
functions $\hat{x}_2(u_2, y_2)\colon \mathcal{U}_2 \times \mathcal{Y}_2 \to \hat{\mathcal{X}}_2$ and $\hat{x}_{e,2}(u_2, x)\colon \mathcal{U}_2 \times \mathcal{X} \to \hat{\mathcal{X}}_2$, such that the distortion
constraints $\mathrm{E}[d_1(X, \hat{X}_1)] \le D_1$ and $\mathrm{E}[d_2(X, \hat{x}_2(U_2, Y_2))] \le D_2$ and the ConR requirement
$\mathrm{E}[d_{e,2}(\hat{x}_2(U_2, Y_2), \hat{x}_{e,2}(U_2, X))] \le D_{e,2}$ are satisfied. It can be proved that this is also the
rate-distortion function under the partial CR requirement that there exists a function $\psi_1(X^n)$ such
that (9) holds for $j = 1$ only. Similar conclusions apply symmetrically to the case where CR
and ConR requirements are imposed only on the reconstruction of Decoder 2.

Remark 17. If both side information sequences are causally available at the decoders, it can
be proved that they have no impact on the rate-distortion function (46). In this case, the
rate-distortion function follows immediately from the results in (46) by removing conditioning on
both side information sequences. Moreover, the result can be simplified by introducing a single
auxiliary random variable. Similarly, if only side information $Y_1$ is causal, then it can be neglected
with no loss of optimality, and the results follow from (46) by removing the conditioning on $Y_1$.

Remark 18. We note that the ConR formulation studied in this section is more general than the
conventional formulation with distortion constraints for the decoders only. Therefore, problems
that are open with the conventional formulation, such as HB with cooperative decoders (Sec.
III) and cascade source coding (Sec. IV), are a fortiori also open in the ConR set-up.

VI. Concluding Remarks 

The Common Reconstruction requirement, and its generalization in [13], substantially
modify the problem of source coding in the presence of side information at the decoders. From
a practical standpoint, in various applications, such as transmission of medical records, CR is
a design constraint. In these cases, evaluation of the rate-distortion performance under CR
reveals the cost, in terms of transmission resources, associated with this additional requirement.
From a theoretical perspective, adding the CR constraint to standard source coding problems with
decoder side information proves instrumental in establishing the optimality of various known
strategies in settings in which the more general problem, without the CR constraint, is open [5].
This paper has extended these considerations from a point-to-point setting to three baseline
multiterminal settings, namely the Heegard-Berger problem, the HB problem with cooperating
decoders and the cascade problem. The optimal rate-distortion trade-off has been derived in a
number of cases and explicitly evaluated in various examples.
A general subject of theoretical interest is identifying those models for which the CR requirement
enables a solution of problems that have otherwise resisted solution for decades.
Examples include the Heegard-Berger and cascade source coding problems with no assumptions
on side information degradedness and the one-helper lossy source coding problem.

Appendix A: Proof of Proposition 1

We first observe that, from Definition 1, since the distortion and CR constraints (8) and (9) depend
only on the marginal pmfs $p(x, y_1)$ and $p(x, y_2)$, so does the rate-distortion function. Therefore,
in the proof, we can assume, without loss of generality, that the joint pmf $p(x, y_1, y_2)$ satisfies
the Markov chain condition $X - Y_2 - Y_1$ so that it factorizes as (cf. (10))

$$p(x, y_1, y_2) = p(x, y_2)\, p(y_1 | y_2). \tag{51}$$

Consider an $(n, R, D_1 + \epsilon, D_2 + \epsilon, \epsilon)$ code, whose existence is required for achievability by
Definition 1. By the CR requirements (9), we first observe that we have the Fano inequalities

$$H(\psi_j(X^n) | h_j(g(X^n), Y_j^n)) \le n\delta(\epsilon), \quad \text{for } j = 1, 2, \tag{52}$$

for $n$ sufficiently large, where $n\delta(\epsilon) = n\epsilon \log|\mathcal{X}| + H_b(\epsilon)$. Moreover, we can write

$$nR \ge H(J) \ge H(J | Y_1^n) \tag{53a}$$
$$\overset{(a)}{=} H(J | Y_1^n Y_2^n) + I(J; Y_2^n | Y_1^n), \tag{53b}$$

where (a) follows by the definition of mutual information. From now on, to simplify notation,
we do not make explicit the dependence of $\psi_j$ and $g$ on $X^n$, and of $h_j$ on $(J, Y_j^n)$. We
also define $\psi_{ji}$ as the $i$th symbol of the sequence $\psi_j$ so that $\psi_j = (\psi_{j1}, \ldots, \psi_{jn})$.

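The first term in (53b), $H(J | Y_1^n Y_2^n)$, can be treated as in [5, Sec. V.A.], or, more simply, we
can proceed as follows:

$$\begin{aligned}
H(J | Y_1^n Y_2^n) &\overset{(a)}{=} I(J; X^n | Y_1^n Y_2^n) && \text{(54a)} \\
&\overset{(b)}{\ge} I(h_1 h_2; X^n | Y_1^n Y_2^n) && \text{(54b)} \\
&= I(h_1 h_2 \psi_1 \psi_2; X^n | Y_1^n Y_2^n) - I(\psi_1 \psi_2; X^n | Y_1^n Y_2^n h_1 h_2) && \text{(54c)} \\
&\overset{(c)}{\ge} I(\psi_1 \psi_2; X^n | Y_1^n Y_2^n) - I(\psi_1 \psi_2; X^n | Y_1^n Y_2^n h_1 h_2) && \text{(54d)} \\
&\overset{(d)}{=} I(\psi_1 \psi_2; X^n | Y_2^n) - H(\psi_1 \psi_2 | Y_1^n Y_2^n h_1 h_2) + H(\psi_1 \psi_2 | Y_1^n Y_2^n h_1 h_2 X^n) && \text{(54e)} \\
&\overset{(e)}{\ge} I(\psi_1 \psi_2; X^n | Y_2^n) - n\delta(\epsilon) && \text{(54f)} \\
&\overset{(f)}{\ge} \sum_{i=1}^{n} I(\psi_{1i} \psi_{2i}; X_i | Y_{2i}) - n\delta(\epsilon), && \text{(54g)}
\end{aligned}$$

where (a) follows because $J$ is a function of $X^n$; (b) follows since $h_1$ and $h_2$ are functions of
$(J, Y_1^n)$ and $(J, Y_2^n)$, respectively; (c) follows by the chain rule of mutual information and since
mutual information is non-negative; (d) follows by using the Markov chain $(\psi_1, \psi_2, X^n) - Y_2^n - Y_1^n$;
(e) follows by (52) and since entropy is non-negative; and (f) follows by the chain rule for
entropy, since $X^n$ and $Y_2^n$ are i.i.d., and due to the fact that conditioning decreases entropy.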

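Similarly, the second term in (53b), namely, $I(J; Y_2^n | Y_1^n)$, leads to

$$\begin{aligned}
I(J; Y_2^n | Y_1^n) &\overset{(a)}{\ge} I(h_1; Y_2^n | Y_1^n) && \text{(55a)} \\
&= I(h_1 \psi_1; Y_2^n | Y_1^n) - I(\psi_1; Y_2^n | Y_1^n h_1) && \text{(55b)} \\
&\overset{(b)}{\ge} I(\psi_1; Y_2^n | Y_1^n) - H(\psi_1 | Y_1^n h_1) + H(\psi_1 | Y_1^n Y_2^n h_1) && \text{(55c)} \\
&\overset{(c)}{\ge} I(\psi_1; Y_2^n | Y_1^n) - n\delta(\epsilon) && \text{(55d)} \\
&\overset{(d)}{\ge} \sum_{i=1}^{n} I(\psi_{1i}; Y_{2i} | Y_{1i}) - n\delta(\epsilon), && \text{(55e)}
\end{aligned}$$

where (a) follows because $h_1$ is a function of $J$ and $Y_1^n$; (b) follows by the chain rule of mutual
information and since mutual information is non-negative; (c) follows by (52) and since entropy
is non-negative; and (d) follows by the chain rule for entropy, since $Y_2^n$ and $Y_1^n$ are i.i.d., and
due to the fact that conditioning decreases entropy. From (53b), (54g), and (55e), we then have

$$\begin{aligned}
nR &\ge \sum_{i=1}^{n} \left[ I(\psi_{1i} \psi_{2i}; X_i | Y_{2i}) + I(\psi_{1i}; Y_{2i} | Y_{1i}) \right] - n\delta(\epsilon) && \text{(56a)} \\
&\overset{(a)}{=} \sum_{i=1}^{n} \left[ I(X_i; \psi_{1i} | Y_{1i}) + I(X_i; \psi_{2i} | Y_{2i} \psi_{1i}) \right] - n\delta(\epsilon), && \text{(56b)}
\end{aligned}$$

where (a) follows because of the Markov chain relationship $(\psi_{1i}, \psi_{2i}) - X_i - Y_{2i} - Y_{1i}$, for
$i = 1, \ldots, n$. By defining $\hat{X}_{ji} = \psi_{ji}$ with $j = 1, 2$ and $i = 1, \ldots, n$, the proof is concluded as in
[5].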
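The equality between the two single-letter expressions in (56), which rests only on the stated Markov chain, can be spelled out as follows. This is a worked derivation under the notational assumption that the time index $i$ is dropped and $\hat{X}_j$ stands for $\psi_{ji}$; the reason used in each step is given on the right.

```latex
\begin{align*}
I(\hat{X}_1 \hat{X}_2; X \mid Y_2)
  &= I(\hat{X}_1; X \mid Y_2) + I(\hat{X}_2; X \mid Y_2 \hat{X}_1)
  && \text{(chain rule)} \\
I(\hat{X}_1; X \mid Y_2)
  &= I(\hat{X}_1; X) - I(\hat{X}_1; Y_2)
  && (\hat{X}_1 - X - Y_2) \\
I(\hat{X}_1; Y_2 \mid Y_1)
  &= I(\hat{X}_1; Y_2) - I(\hat{X}_1; Y_1)
  && (\hat{X}_1 - Y_2 - Y_1) \\
I(\hat{X}_1; X) - I(\hat{X}_1; Y_1)
  &= I(X; \hat{X}_1 \mid Y_1)
  && (\hat{X}_1 - X - Y_1) \\
\Rightarrow\quad
I(\hat{X}_1 \hat{X}_2; X \mid Y_2) + I(\hat{X}_1; Y_2 \mid Y_1)
  &= I(X; \hat{X}_1 \mid Y_1) + I(X; \hat{X}_2 \mid Y_2 \hat{X}_1).
\end{align*}
```

The terms $I(\hat{X}_1; Y_2)$ cancel when the second and third lines are added, which is what turns the sum in (56a) into the form (56b).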

Appendix B: Proof of Proposition 2

As explained in the text, we only need to focus on the case where $D_2 \le D_1 \le \sigma_x^2$. As per
the discussion in Appendix A, we can assume, without loss of generality, that the Markov chain
relationship $X - Y_2 - Y_1$ holds, so that

$$Y_2 = X + Z_2 \tag{57a}$$
$$\text{and } Y_1 = Y_2 + Z_1, \tag{57b}$$

where $Z_1 \sim \mathcal{N}(0, N_1)$ is independent of $(X, Z_2)$.

We first prove a converse. Calculating the rate-distortion function in (14) requires minimization
over the pmf $p(\hat{x}_1, \hat{x}_2 | x)$ under the constraint (13). A minimizing $p(\hat{x}_1, \hat{x}_2 | x)$ exists by the
Weierstrass theorem due to the continuity of the mutual information and the compactness of
the set of pmfs defined by the constraint (13) [21]. Fixing one such optimizing $p(\hat{x}_1, \hat{x}_2 | x)$, and
since $I(X; \hat{X}_1 \hat{X}_2 | Y_2) \ge I(X; \hat{X}_2 | Y_2)$, the rate-distortion function (14) can be bounded as

$$R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2) \ge I(X; \hat{X}_2 | Y_2) + I(\hat{X}_1; Y_2 | Y_1). \tag{58}$$

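The first term in (58), i.e., $I(X; \hat{X}_2 | Y_2)$, can be easily bounded using the approach in [5, p.
5007]. Specifically, we have

$$\begin{aligned}
I(X; \hat{X}_2 | Y_2) &= h(X | Y_2) - h(X | \hat{X}_2 Y_2) \\
&= h(X | X + Z_2) - h(X - \hat{X}_2 | \hat{X}_2, \hat{X}_2 + (X - \hat{X}_2) + Z_2) \\
&= h(X | X + Z_2) - h(X - \hat{X}_2 | \hat{X}_2, (X - \hat{X}_2) + Z_2) \\
&\overset{(a)}{\ge} h(X | X + Z_2) - h(X - \hat{X}_2 | (X - \hat{X}_2) + Z_2) \\
&\overset{(b)}{\ge} \frac{1}{2} \log_2\left(2\pi e \frac{\sigma_x^2 N_2}{\sigma_x^2 + N_2}\right) - \frac{1}{2} \log_2\left(2\pi e \frac{D_2 N_2}{D_2 + N_2}\right) \\
&= \frac{1}{2} \log_2\left(\frac{\sigma_x^2}{\sigma_x^2 + N_2} \cdot \frac{D_2 + N_2}{D_2}\right),
\end{aligned} \tag{59}$$

where (a) follows because conditioning decreases entropy; and (b) follows from the maximum
conditional entropy lemma [14, p. 21], which implies that $h(E | E + Z_2) \le \frac{1}{2} \log_2(2\pi e\, \sigma^2_{E|E+Z_2})$
with $E = X - \hat{X}_2$. In fact, we have that $\sigma^2_{E|E+Z_2} \le \frac{D_2 N_2}{D_2 + N_2}$, since the conditional variance $\sigma^2_{E|E+Z_2}$
is upper bounded by the linear minimum mean square error of the estimate of $E$ given $E + Z_2$.
This mean square error is at most $\frac{D_2 N_2}{D_2 + N_2}$, since we have $\mathrm{E}[E^2] \le D_2$ and since $Z_2$ is independent
of $E$ due to the factorization (12) and to the independence of $X$ and $Z_2$. For the second term
in (58), we instead have the following:

$$\begin{aligned}
I(\hat{X}_1; Y_2 | Y_1) &= h(Y_2 | Y_1) - h(Y_2 | Y_1, \hat{X}_1) \\
&= \frac{1}{2} \log_2\left(2\pi e \frac{N_1 (\sigma_x^2 + N_2)}{N_1 + N_2 + \sigma_x^2}\right) - h(Y_2 | Y_1, \hat{X}_1).
\end{aligned} \tag{60}$$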
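Moreover, we can evaluate

$$\begin{aligned}
h(Y_2 | Y_1, \hat{X}_1) &= h(Y_2, Y_1 | \hat{X}_1) - h(Y_1 | \hat{X}_1) \\
&= h(Y_2 | \hat{X}_1) + h(Y_1 | Y_2, \hat{X}_1) - h(Y_1 | \hat{X}_1) \\
&= h(Y_2 | \hat{X}_1) - h(Y_2 + Z_1 | \hat{X}_1) + h(Y_2 + Z_1 | Y_2, \hat{X}_1) \\
&\overset{(a)}{=} h(Y_2 | \hat{X}_1) - h(Y_2 + Z_1 | \hat{X}_1) + \frac{1}{2} \log_2(2\pi e N_1),
\end{aligned} \tag{61}$$

where (a) follows because $Z_1$ is independent of $Y_2$ and of $\hat{X}_1$, due to the factorization (12) and
due to the independence of $Z_1$ and $X$. Next, we obtain a lower bound on the term $h(Y_2 + Z_1 | \hat{X}_1)$
in (61) as a function of $h(Y_2 | \hat{X}_1)$ by using the entropy power inequality (EPI) [14, p. 22].
Specifically, by using the conditional version of the EPI [14, p. 22], we have

$$\begin{aligned}
2^{2h(Y_2 + Z_1 | \hat{X}_1)} &\ge 2^{2h(Y_2 | \hat{X}_1)} + 2^{2h(Z_1 | \hat{X}_1)} \\
&\overset{(a)}{=} 2^{2h(Y_2 | \hat{X}_1)} + 2\pi e N_1,
\end{aligned} \tag{62}$$

where (a) follows because $Z_1$ is independent of $\hat{X}_1$ as explained above. The first two terms in
(61) can thus be bounded as

$$\begin{aligned}
h(Y_2 | \hat{X}_1) - h(Y_2 + Z_1 | \hat{X}_1) &\le h(Y_2 | \hat{X}_1) - \frac{1}{2} \log_2\left(2^{2h(Y_2 | \hat{X}_1)} + 2\pi e N_1\right) \\
&= \frac{1}{2} \log_2\left(\frac{2^{2h(Y_2 | \hat{X}_1)}}{2^{2h(Y_2 | \hat{X}_1)} + 2\pi e N_1}\right) \\
&\overset{(a)}{\le} \frac{1}{2} \log_2\left(\frac{2\pi e (D_1 + N_2)}{2\pi e (D_1 + N_2) + 2\pi e N_1}\right),
\end{aligned} \tag{63}$$

where (a) follows because $\log_2\left(\frac{2^{2h(Y_2 | \hat{X}_1)}}{2^{2h(Y_2 | \hat{X}_1)} + 2\pi e N_1}\right)$ is an increasing function of $h(Y_2 | \hat{X}_1)$ and
$h(Y_2 | \hat{X}_1) \le \frac{1}{2} \log_2(2\pi e (D_1 + N_2))$, as can be proved by using the same approach used for the
bounds (a) and (b) in (59). By substituting (63) into (61), and using the result in (60), we obtain

$$\begin{aligned}
I(\hat{X}_1; Y_2 | Y_1) &\ge \frac{1}{2} \log_2\left(2\pi e \frac{N_1 (N_2 + \sigma_x^2)}{N_1 + N_2 + \sigma_x^2}\right) - \frac{1}{2} \log_2\left(2\pi e \frac{N_1 (D_1 + N_2)}{D_1 + N_2 + N_1}\right) \\
&= \frac{1}{2} \log_2\left(\frac{(N_2 + \sigma_x^2)(D_1 + N_2 + N_1)}{(N_1 + N_2 + \sigma_x^2)(D_1 + N_2)}\right).
\end{aligned} \tag{64}$$

Finally, by substituting (59) and (64) into (58), we obtain the lower bound

$$\begin{aligned}
R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2) &\ge \frac{1}{2} \log_2\left(\frac{\sigma_x^2}{\sigma_x^2 + N_2} \cdot \frac{D_2 + N_2}{D_2}\right) + \frac{1}{2} \log_2\left(\frac{(N_2 + \sigma_x^2)(D_1 + N_1 + N_2)}{(N_1 + N_2 + \sigma_x^2)(D_1 + N_2)}\right) \\
&= \frac{1}{2} \log_2\left(\frac{\sigma_x^2 (D_1 + N_1 + N_2)(D_2 + N_2)}{(\sigma_x^2 + N_1 + N_2)(D_1 + N_2) D_2}\right).
\end{aligned} \tag{65}$$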

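For achievability, we calculate (14) with $X = \hat{X}_2 + Q_2$ and $\hat{X}_2 = \hat{X}_1 + Q_1$, where $Q_1 \sim
\mathcal{N}(0, D_1 - D_2)$ and $Q_2 \sim \mathcal{N}(0, D_2)$ are independent of each other and of $(\hat{X}_1, Z_1, Z_2)$. This
leads to the upper bound

$$\begin{aligned}
R_{\mathrm{HB}}^{\mathrm{CR}}(D_1, D_2) &\le I(X; \hat{X}_1 \hat{X}_2 | Y_2) + I(\hat{X}_1; Y_2 | Y_1) \\
&= I(X; \hat{X}_2 | Y_2) + I(\hat{X}_1; Y_2 | Y_1) \\
&= h(X | Y_2) - h(X | Y_2, \hat{X}_2) + h(Y_2 | Y_1) - h(Y_2 | Y_1, \hat{X}_1) \\
&= h(X | X + Z_2) - h(\hat{X}_2 + Q_2 | \hat{X}_2 + Q_2 + Z_2, \hat{X}_2) \\
&\quad + h(X + Z_2 | X + Z_2 + Z_1) - h(\hat{X}_1 + Q_1 + Q_2 + Z_2 | \hat{X}_1 + Q_1 + Q_2 + Z_2 + Z_1, \hat{X}_1) \\
&= h(X | X + Z_2) - h(Q_2 | Q_2 + Z_2) + h(X + Z_2 | X + Z_2 + Z_1) \\
&\quad - h(Q_1 + Q_2 + Z_2 | Q_1 + Q_2 + Z_2 + Z_1) \\
&\overset{(a)}{=} \frac{1}{2} \log_2\left(2\pi e \frac{\sigma_x^2}{1 + \frac{\sigma_x^2}{N_2}}\right) - \frac{1}{2} \log_2\left(2\pi e \frac{D_2}{1 + \frac{D_2}{N_2}}\right) + \frac{1}{2} \log_2\left(2\pi e \frac{\sigma_x^2 + N_2}{1 + \frac{\sigma_x^2 + N_2}{N_1}}\right) \\
&\quad - \frac{1}{2} \log_2\left(2\pi e \frac{D_1 + N_2}{1 + \frac{D_1 + N_2}{N_1}}\right) \\
&= \frac{1}{2} \log_2\left(\frac{\sigma_x^2 (D_1 + N_1 + N_2)(D_2 + N_2)}{(\sigma_x^2 + N_1 + N_2)(D_1 + N_2) D_2}\right),
\end{aligned} \tag{66}$$

where (a) follows using $h(A | A + B) = \frac{1}{2} \log_2\left(2\pi e \frac{\Sigma_A}{1 + \Sigma_A / \Sigma_B}\right)$, for $A$ and $B$ being independent
Gaussian sources with $A \sim \mathcal{N}(0, \Sigma_A)$ and $B \sim \mathcal{N}(0, \Sigma_B)$. By comparing (65) with (66), we
complete the proof.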
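The closed-form expression on which the converse and achievability arguments of this appendix agree can be checked numerically. The sketch below is an illustration only: it evaluates the two lower-bound terms of (59) and (64) and the closed form shared by (65) and (66) in bits, for the regime $D_2 \le D_1 \le \sigma_x^2$ assumed in the proof. The function names and the argument order are choices made here, not notation from the paper.

```python
import math


def wz_cr_term(sigma2, n2, d2):
    """Lower bound (59) on I(X; X2_hat | Y2), in bits."""
    return 0.5 * math.log2(sigma2 * (d2 + n2) / ((sigma2 + n2) * d2))


def degraded_term(sigma2, n1, n2, d1):
    """Lower bound (64) on I(X1_hat; Y2 | Y1), in bits."""
    return 0.5 * math.log2(
        (n2 + sigma2) * (d1 + n1 + n2) / ((n1 + n2 + sigma2) * (d1 + n2))
    )


def gaussian_hb_cr_rate(sigma2, n1, n2, d1, d2):
    """Closed form of (65)/(66), valid for 0 < d2 <= d1 <= sigma2."""
    if not (0 < d2 <= d1 <= sigma2):
        raise ValueError("expression derived only for 0 < D2 <= D1 <= sigma_x^2")
    return 0.5 * math.log2(
        sigma2 * (d1 + n1 + n2) * (d2 + n2)
        / ((sigma2 + n1 + n2) * (d1 + n2) * d2)
    )
```

Two sanity checks follow directly from the formula: the rate vanishes when $D_1 = D_2 = \sigma_x^2$, and for $D_1 = \sigma_x^2$ it collapses to the single term (59) associated with Decoder 2 alone.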

Appendix C: Proof of (23)

Here, we prove that (2) equals (23) for the given sources. For the converse, we have that

$$\begin{aligned}
I(X; \hat{X} | Y) &= H(X | Y) - H(X | \hat{X}, Y) \\
&= p - p H(X | \hat{X}, Y \ne X) - (1 - p) H(X | \hat{X}, Y = X) \\
&= p - p H(X | \hat{X}, Y \ne X) \\
&= p - p H(X | \hat{X}) \\
&= p - p H(X \oplus \hat{X} | \hat{X}) \\
&\overset{(a)}{\ge} p - p H(X \oplus \hat{X}) \\
&\ge p - p H(D) \\
&= p (1 - H(D)),
\end{aligned} \tag{67}$$

where (a) follows because conditioning decreases entropy. Achievability follows by calculating
(2) with $X = \hat{X} \oplus Q$ where $Q \sim \mathrm{Ber}(D)$.
Appendix D: Proof of Proposition [3] 

As explained in the text, we only need to focus on the case where D2 < Di < 1/2. As for 
Appendix A and Appendix B, we can assume, without loss of generality, that the joint pmf of 
{x,yi,y2) factorizes as (|5TI) as shown Fig.|71 We first prove a converse. Similar to (|58l) . we can 
write the rate-distortion function (fT4l) as 

R%'l{D,,D2)= I{X;X2\Y2) + I{Xi;Y2\Y^), (68) 

where the mutual information terms are calculated with a distribution p{xi,X2\x) minimizing 
(fT4l) under the constraint (fT3l) . The first term in (1681) . i.e., /(X;X2|F2), can be easily bounded 
by following the same steps used in the derivation of (l67l) . leading to 

I{X;X2\Y2) > P2{l-H{D2)). (69) 
For the second term in (|68] ). we instead have the following: 

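\begin{align*}
I(\hat{X}_1;Y_2 \mid Y_1) &= H(Y_2 \mid Y_1) - H(Y_2 \mid Y_1,\hat{X}_1) \\
&= H(Y_2 \mid Y_1) - H(Y_2,Y_1 \mid \hat{X}_1) + H(Y_1 \mid \hat{X}_1) \tag{70} \\
&= H(Y_2 \mid Y_1) - H(Y_2 \mid \hat{X}_1) - H(Y_1 \mid \hat{X}_1,Y_2) + H(Y_1 \mid \hat{X}_1) \tag{71} \\
&\overset{(a)}{=} H(Y_2 \mid Y_1) - H(Y_2 \mid \hat{X}_1) - H(Y_1 \mid Y_2) + H(Y_1 \mid \hat{X}_1), \tag{72}
\end{align*}
where (a) follows because of the Markov chain condition $Y_1 - Y_2 - \hat{X}_1$. The second term in the
right-hand side of (71) can be evaluated as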

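\begin{align*}
H(Y_2 \mid \hat{X}_1) &= H(Y_2,X \mid \hat{X}_1) - H(X \mid Y_2,\hat{X}_1) \\
&= H(X \mid \hat{X}_1) + H(Y_2 \mid X,\hat{X}_1) - H(X \mid Y_2,\hat{X}_1) \\
&= H(X \mid \hat{X}_1) + H(Y_2 \mid X) - p_2\,H(X \mid Y_2 = e,\hat{X}_1) - (1-p_2)\,H(X \mid Y_2 = X,\hat{X}_1) \\
&\overset{(a)}{=} H(X \mid \hat{X}_1) + H(p_2) - p_2\,H(X \mid \hat{X}_1) \\
&= H(p_2) + (1-p_2)\,H(X \mid \hat{X}_1), \tag{73}
\end{align*}

where (a) follows because $H(Y_2 \mid X) = H(p_2)$. The fourth term in the right-hand side of (72)
can similarly be evaluated as

\[
H(Y_1 \mid \hat{X}_1) = H(p_1) + (1-p_1)\,H(X \mid \hat{X}_1). \tag{74}
\]

Substituting (73) and (74) in (72), we obtain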

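\begin{align*}
I(\hat{X}_1;Y_2 \mid Y_1) &= H(p_1) + (1-p_1)\,H(X \mid \hat{X}_1) - \big(H(p_2) + (1-p_2)\,H(X \mid \hat{X}_1)\big) \\
&\quad + H(Y_2 \mid Y_1) - H(Y_1 \mid Y_2) \\
&= H(p_1) - H(p_2) - (p_1-p_2)\,H(X \mid \hat{X}_1) + H(Y_2) - H(Y_1) \\
&\overset{(a)}{\geq} (p_1-p_2) - (p_1-p_2)\,H(D_1), \tag{75}
\end{align*}

where (a) follows since $H(Y_2) = H(p_2) + (1-p_2)$ and $H(Y_1) = H(p_1) + (1-p_1)$ and due to
the inequality $H(X \mid \hat{X}_1) \leq H(D_1)$. Substituting (75) and (69) into (68), we obtain

\begin{align*}
R_{HB}^{CR}(D_1,D_2) &\geq p_2\,(1-H(D_2)) + (p_1-p_2)\,(1-H(D_1)) \\
&= p_1\,(1-H(D_1)) + p_2\,(H(D_1)-H(D_2)). \tag{76}
\end{align*}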

For achievability, we calculate (14) with $X = \hat{X}_2 \oplus Q_2$ and $\hat{X}_2 = \hat{X}_1 \oplus Q_1$, where $Q_1 \sim
\mathrm{Ber}(D_1 * D_2)$ and $Q_2 \sim \mathrm{Ber}(D_2)$ are independent of each other and of $(\hat{X}_1,E_1,E_2)$, where
$E_j = 1\{Y_j = e\}$ for $j = 1, 2$. This leads to the upper bound

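\begin{align*}
R_{HB}^{CR}(D_1,D_2) &\leq I(X;\hat{X}_1,\hat{X}_2 \mid Y_2) + I(\hat{X}_1;Y_2 \mid Y_1) \\
&= I(X;\hat{X}_2 \mid Y_2) + I(\hat{X}_1;Y_2 \mid Y_1) \\
&= H(X \mid Y_2) - H(X \mid Y_2,\hat{X}_2) + H(Y_2 \mid Y_1) - H(Y_2 \mid Y_1,\hat{X}_1) \\
&\overset{(a)}{=} p_2 - p_2\,H(X \mid \hat{X}_2, Y_2 = e) - (1-p_2)\,H(X \mid \hat{X}_2, Y_2 = X) + p_1 H\!\left(\frac{p_2}{p_1}\right) \\
&\quad + (p_1 - p_2) - p_1\,H(Y_2 \mid \hat{X}_1, Y_1 = e) - (1-p_1)\,H(Y_2 \mid \hat{X}_1, Y_1 = X) \\
&\overset{(b)}{=} p_2 - p_2\,H(X \mid \hat{X}_2) + p_1 H\!\left(\frac{p_2}{p_1}\right) + p_1\left(1-\frac{p_2}{p_1}\right) \\
&\quad - p_1\left(H\!\left(\frac{p_2}{p_1}\right) + \left(1-\frac{p_2}{p_1}\right)H(X \mid \hat{X}_1)\right) \\
&\overset{(c)}{=} p_2 - p_2\,H(\hat{X}_2 \oplus Q_2 \mid \hat{X}_2) + p_1\left(1-\frac{p_2}{p_1}\right) - p_1\left(1-\frac{p_2}{p_1}\right)H(\hat{X}_1 \oplus Q_1 \oplus Q_2 \mid \hat{X}_1) \\
&\overset{(d)}{=} p_2 - p_2\,H(D_2) + p_1\left(1-\frac{p_2}{p_1}\right) - p_1\left(1-\frac{p_2}{p_1}\right)H(D_1) \\
&= p_1\,(1-H(D_1)) + p_2\,(H(D_1)-H(D_2)), \tag{77}
\end{align*}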

where (a) follows because $H(Y_2 \mid Y_1) = p_1 H(p_2/p_1) + (p_1 - p_2)$; (b) follows because
$H(Y_2 \mid \hat{X}_1, Y_1 = X) = H(X \mid \hat{X}_1, Y_1 = X) = 0$ and $H(Y_2 \mid \hat{X}_1, Y_1 = e) = H(p_2/p_1) + (1 - p_2/p_1)\,H(X \mid \hat{X}_1)$; (c) follows
by using the inverse test channels $X = \hat{X}_2 \oplus Q_2$ and $\hat{X}_2 = \hat{X}_1 \oplus Q_1$; and (d) follows because
$Q_2 \sim \mathrm{Ber}(D_2)$ and $Q_1 \oplus Q_2 \sim \mathrm{Ber}(D_1)$. By comparing (76) with (77), we complete the proof.
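
The achievable rate can also be verified by exhaustive enumeration. The following sketch constructs the joint pmf of $(X, \hat{X}_1, \hat{X}_2, Y_1, Y_2)$ for physically degraded erasure side information and evaluates $I(X;\hat{X}_2 \mid Y_2) + I(\hat{X}_1;Y_2 \mid Y_1)$ numerically. Several choices here are assumptions made for illustration: $\hat{X}_1 \sim \mathrm{Ber}(1/2)$, the Bernoulli parameter of $Q_1$ is set to $(D_1 - D_2)/(1 - 2D_2)$ so that $Q_1 \oplus Q_2 \sim \mathrm{Ber}(D_1)$ as step (d) requires, and all function names and numerical values are hypothetical.

```python
import itertools
import math

def binary_entropy(q):
    """Binary entropy H(q) in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def entropy(pmf, idx):
    """Entropy in bits of the marginal of `pmf` on the coordinates in `idx`."""
    marginal = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[i] for i in idx)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(v * math.log2(v) for v in marginal.values() if v > 0)

def cond_mi(pmf, a, b, c):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))

def joint_pmf(d1, d2, p1, p2):
    """Joint pmf of (X, Xhat1, Xhat2, Y1, Y2); Y1 is an erased version of Y2."""
    q1 = (d1 - d2) / (1 - 2 * d2)   # assumption: gives Q1 xor Q2 ~ Ber(d1)
    r = (p1 - p2) / (1 - p2)        # P(Y1 = e | Y2 = X), so that P(Y1 = e) = p1
    pmf = {}
    for xh1, b1, b2, e2, e1 in itertools.product((0, 1), repeat=5):
        if e2 and not e1:
            continue                # an erased Y2 forces an erased Y1
        xh2 = xh1 ^ b1
        x = xh2 ^ b2
        prob = 0.5 * (q1 if b1 else 1 - q1) * (d2 if b2 else 1 - d2)
        prob *= p2 if e2 else (1 - p2) * (r if e1 else 1 - r)
        key = (x, xh1, xh2, "e" if e1 else x, "e" if e2 else x)
        pmf[key] = pmf.get(key, 0.0) + prob
    return pmf

D1, D2, P1, P2 = 0.2, 0.1, 0.5, 0.3   # hypothetical, with D2 <= D1 <= 1/2 and P2 <= P1
pmf = joint_pmf(D1, D2, P1, P2)
rate = cond_mi(pmf, (0,), (2,), (4,)) + cond_mi(pmf, (1,), (4,), (3,))
target = P1 * (1 - binary_entropy(D1)) + P2 * (binary_entropy(D1) - binary_entropy(D2))
```

With these assumptions the enumerated rate matches the closed-form expression on the right-hand side of the upper bound, up to floating-point error.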

Appendix E: Proof of Proposition 9

Here we provide the proof of Proposition 9. To this end, we prove that for any pair $(D_1, D_2)$
there exists a joint distribution $p(\hat{x}_1,\hat{x}_2 \mid x)$ such that (13) is satisfied and the conditions (43a) and
(43b) coincide with (42a) and (42b), respectively. This entails that the inner and outer bounds
of Proposition 7 and Proposition 8 coincide.

We distinguish the four regions in the $(D_1, D_2)$ plane depicted in Fig. 5. If $D_1 \geq \sigma_x^2$ and $D_2 \geq
\sigma_x^2$, it is enough to set $\hat{X}_1 = \hat{X}_2 = 0$ in (43) to prove the desired result. For $D_1 \leq \sigma_x^2$ and $D_2 \geq \min(D_1,\sigma_x^2)$,
we instead set $\hat{X}_1 = \hat{X}_2$ and $X = \hat{X}_1 + Q_1$ in (43), where $Q_1 \sim \mathcal{N}(0, D_1)$ is independent of
$\hat{X}_1$. Following the discussion in Sec. III-C, it is easy to see that this choice is such that (43)
coincides with (42). Next, in the sub-region where $D_1 \geq \sigma_x^2$ and $D_2 \leq \sigma_x^2$, we select $\hat{X}_1 = 0$
and $X = \hat{X}_2 + Q_2$ in (43), where $Q_2 \sim \mathcal{N}(0, D_2)$ is independent of $\hat{X}_2$. Finally, for the region
in Fig. 5 for which $D_2 \leq D_1 \leq \sigma_x^2$, we choose $X = \hat{X}_2 + Q_2$ and $\hat{X}_2 = \hat{X}_1 + Q_1$, where
$Q_1 \sim \mathcal{N}(0, D_1 - D_2)$ and $Q_2 \sim \mathcal{N}(0, D_2)$ are independent of each other and of $(\hat{X}_1, Z_1, Z_2)$.
With this choice, following the derivations in Appendix B, we conclude that condition (43a)
coincides with (42a). As for (43b), we proceed as follows:

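\begin{align*}
I(X;\hat{X}_1 \mid Y_2) + I(X;\hat{X}_2 \mid \hat{X}_1,Y_2) &= I(X;\hat{X}_1,\hat{X}_2 \mid Y_2) \\
&= h(X \mid Y_2) - h(X \mid \hat{X}_1,\hat{X}_2,Y_2) \\
&= h(X \mid X+Z_2) \\
&\quad - h(\hat{X}_1+Q_1+Q_2 \mid \hat{X}_1, \hat{X}_1+Q_1, \hat{X}_1+Q_1+Q_2+Z_2) \\
&= h(X \mid X+Z_2) - h(Q_1+Q_2 \mid Q_1, Q_1+Q_2+Z_2) \\
&= h(X \mid X+Z_2) - h(Q_2 \mid Q_2+Z_2) \\
&= \frac{1}{2}\log_2\left(2\pi e\,\frac{\sigma_x^2}{1+\sigma_x^2/N_2}\right) - \frac{1}{2}\log_2\left(2\pi e\,\frac{D_2}{1+D_2/N_2}\right) \\
&= \frac{1}{2}\log_2\left(\frac{\sigma_x^2}{\sigma_x^2+N_2}\cdot\frac{D_2+N_2}{D_2}\right) \\
&= R_{G}^{CR}(D_2,N_2), \tag{78}
\end{align*}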

which concludes the proof. 
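
The chain of equalities above can be checked without using the entropy identities by computing Gaussian conditional variances directly. In the sketch below every variable is written as a linear combination of the independent sources $(\hat{X}_1, Q_1, Q_2, Z_2)$, the conditional variance is obtained from the covariance matrix, and $I(X;\hat{X}_1,\hat{X}_2 \mid Y_2)$ is evaluated as half the log-ratio of two conditional variances. The helper functions, the assumption $\mathrm{Var}(\hat{X}_1) = \sigma_x^2 - D_1$, and the numerical values are illustrative and not taken from the paper.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def cov(u, v, variances):
    """Covariance of two linear combinations of independent zero-mean sources."""
    return sum(a * b * s for a, b, s in zip(u, v, variances))

def cond_var(target, observed, variances):
    """Var(target | observed) for jointly Gaussian linear combinations."""
    gram = [[cov(u, v, variances) for v in observed] for u in observed]
    cross = [cov(target, u, variances) for u in observed]
    weights = solve(gram, cross)
    return cov(target, target, variances) - sum(w * c for w, c in zip(weights, cross))

sx2, D1, D2, N2 = 4.0, 2.0, 1.0, 0.5       # hypothetical, with D2 < D1 < sx2
variances = [sx2 - D1, D1 - D2, D2, N2]    # variances of Xhat1, Q1, Q2, Z2
X, XH1, XH2, Y2 = [1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]

rate = 0.5 * math.log2(cond_var(X, [Y2], variances)
                       / cond_var(X, [XH1, XH2, Y2], variances))
closed = 0.5 * math.log2(sx2 / (sx2 + N2) * (D2 + N2) / D2)
```

Here `rate` is computed from the covariance structure alone, while `closed` is the final logarithmic expression; the two should agree up to rounding.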

Appendix F: Proof of Proposition 10

Here we provide the proof of Proposition 10. Following similar steps as in Appendix E, we
prove that for any pair $(D_1,D_2)$ there exists a joint distribution $p(\hat{x}_1,\hat{x}_2 \mid x)$ such that (13) is
satisfied and the conditions (43a) and (43b) coincide with (42a) and (42b), respectively. This
entails that the inner and outer bounds of Proposition 7 and Proposition 8 coincide.

We distinguish the four regions in the $(D_1, D_2)$ plane depicted in Fig. 8. If $D_1 \geq 1/2$ and $D_2 \geq
1/2$, it is enough to set $\hat{X}_1 = \hat{X}_2 = 0$ in (43) to prove the desired result. For $D_1 \leq 1/2$ and $D_2 \geq
\min(D_1, 1/2)$, we instead set $\hat{X}_1 = \hat{X}_2$ and $X = \hat{X}_1 \oplus Q_1$ in (43), where $Q_1 \sim \mathrm{Ber}(D_1)$ is
independent of $\hat{X}_1$. Following the discussion in Sec. III-D, it is easy to see that this choice is such
that (43) coincides with (42). Next, in the sub-region where $D_1 \geq 1/2$ and $D_2 \leq 1/2$, we select
$\hat{X}_1 = 0$ and $X = \hat{X}_2 \oplus Q_2$ in (43), where $Q_2 \sim \mathrm{Ber}(D_2)$ is independent of $\hat{X}_2$. Finally, for the
region in Fig. 8 for which $D_2 \leq D_1 \leq 1/2$, we choose $X = \hat{X}_2 \oplus Q_2$ and $\hat{X}_2 = \hat{X}_1 \oplus Q_1$, where
$Q_1 \sim \mathrm{Ber}(D_1 * D_2)$ and $Q_2 \sim \mathrm{Ber}(D_2)$ are independent of each other and of $(\hat{X}_1, E_1, E_2)$. With
this choice, following the derivations in Appendix D, we conclude that condition (43a) coincides
with (42a). As for (43b), we proceed as follows:

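\begin{align*}
I(X;\hat{X}_1,\hat{X}_2 \mid Y_2) &= H(X \mid Y_2) - H(X \mid \hat{X}_1,\hat{X}_2,Y_2) \\
&= p_2 - p_2\,H(X \mid \hat{X}_1,\hat{X}_2,Y_2 = e) - (1-p_2)\,H(X \mid \hat{X}_1,\hat{X}_2,Y_2 = X) \\
&= p_2 - p_2\,H(X \mid \hat{X}_1,\hat{X}_2) \\
&\overset{(a)}{=} p_2 - p_2\,H(X \mid \hat{X}_2) \\
&= p_2 - p_2\,H(\hat{X}_2 \oplus Q_2 \mid \hat{X}_2) \\
&= p_2 - p_2\,H(D_2) \\
&= R_{B}^{CR}(D_2,p_2), \tag{79}
\end{align*}
where (a) follows by the Markov chain relationship $X - \hat{X}_2 - \hat{X}_1$. This completes the proof.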

Appendix G: Proof of Proposition 11

The proof of the achievability follows from standard arguments, similar to [16]. For the converse,
following the proof of [6, Theorem 3], we have that for any $(R, D_{e,1}+\epsilon, D_{e,2}+\epsilon, D_1+\epsilon, D_2+\epsilon)$
code, the following inequality holds:

\[
nR \geq \sum_{i=1}^{n} I(X_i;U_{1i} \mid Y_{1i}) + I(X_i;U_{2i} \mid Y_{2i}), \tag{80}
\]

with the definitions $U_{ji} = (J, Y_j^{n\setminus i})$, for $j = 1,2$, with $Y_j^{n\setminus i} = [Y_{j1}^{i-1}, Y_{j(i+1)}^{n}]$. Note that with
the given definition of $U_{ji}$ we have that the $i$th element of the decoding functions can
be written as $h_{ji}(J,Y_j^n) = \hat{x}_{ji}(U_{ji},Y_{ji})$ for all $i = 1,\ldots,n$ and $j = 1,2$. Now, defining $D_{e,ji} =
\mathrm{E}[d_{e,j}(h_{ji}(J,Y_j^n),\psi_{ji}(X^n))]$, we have the following chain of inequalities for the code at hand
and $j = 1, 2$:

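\begin{align*}
D_{e,ji} &= \mathrm{E}_{X^n Y_j^n}\big[d_{e,j}(h_{ji}(J,Y_j^n),\psi_{ji}(X^n))\big] \tag{81a} \\
&\overset{(a)}{=} \mathrm{E}_{X^n U_{ji} Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\psi_{ji}(X^n))\big] \tag{81b} \\
&= \mathrm{E}_{X^n U_{ji}}\mathrm{E}_{Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\psi_{ji}(X^n)) \mid X^n, U_{ji}\big] \tag{81c} \\
&= \sum_{x^n,u_{ji}} p(x^n,u_{ji})\,\mathrm{E}_{Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\psi_{ji}(X^n)) \mid X^n = x^n, U_{ji} = u_{ji}\big] \tag{81d} \\
&\overset{(b)}{\geq} \sum_{x^n,u_{ji}} p(x^n,u_{ji})\,\mathrm{E}_{Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\psi_{ji}(X_i,X^{n\setminus i})) \mid X_i = x_i, X^{n\setminus i} = x^{*n\setminus i}(x_i,u_{ji}), U_{ji} = u_{ji}\big] \tag{81e} \\
&\overset{(c)}{=} \sum_{x_i,u_{ji}} p(x_i,u_{ji})\,\mathrm{E}_{Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\hat{x}_{e,ji}(U_{ji},X_i)) \mid X_i = x_i, U_{ji} = u_{ji}\big] \tag{81f} \\
&= \mathrm{E}_{X_i U_{ji} Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\hat{x}_{e,ji}(U_{ji},X_i))\big], \tag{81g}
\end{align*}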

where (a) follows by using the definition of the random variables $U_{ji} = (J, Y_j^{n\setminus i})$; (b) follows by
selecting $x^{*n\setminus i}(x_i,u_{ji})$ as

\[
x^{*n\setminus i}(x_i,u_{ji}) \in \arg\min_{x^{n\setminus i} \in \mathcal{X}^{n-1}} \mathrm{E}_{Y_{ji}}\big[d_{e,j}(\hat{x}_{ji}(U_{ji},Y_{ji}),\psi_{ji}(X_i,X^{n\setminus i})) \mid X_i = x_i, X^{n\setminus i} = x^{n\setminus i}, U_{ji} = u_{ji}\big];
\]

and (c) follows from the Markov chain relationship $Y_{ji} - (X_i, U_{ji}) - X^{n\setminus i}$ and from the definition
$\hat{x}_{e,ji}(U_{ji},X_i) = \psi_{ji}(X_i, x^{*n\setminus i}(X_i,U_{ji}))$. Let $Q$ be a uniform random variable over the interval
$[1, n]$ and independent of the variables $(X^n, Y_1^n, Y_2^n, U_1^n, U_2^n, \hat{X}_1^n, \hat{X}_2^n, \hat{X}_{e,1}^n, \hat{X}_{e,2}^n)$, and define the
random variables $U_j = (Q, U_{jQ})$, $X = X_Q$, $Y_j = Y_{jQ}$, $\hat{X}_j = \hat{X}_{jQ}$, and $\hat{X}_{e,j} = \hat{X}_{e,jQ}$ for $j = 1, 2$.
Moreover, note that $\hat{X}_j$ is a deterministic function of $U_j$ and $Y_j$, and $\hat{X}_{e,j}$ is a deterministic
function of $U_j$ and $X$ for $j = 1,2$. The proof is completed by using (45) and the fact that
the term $I(X_i;U_{1i} \mid Y_{1i}) + I(X_i;U_{2i} \mid Y_{2i})$ in (80) is convex with respect to the pmf $p(u_{1i},u_{2i} \mid x_i)$,